METHOD FOR PRODUCING MASS-CODED COMBINATORIAL LIBRARIES

BACKGROUND OF THE INVENTION 

Genomics is identifying the genes responsible for all
human functions and diseases. With 80,000 genes in the
human genome, the thousands of genes involved in
development, stature, intelligence, and other features of a
human being are being defined. Humans suffer from hundreds
of inherited and infectious diseases, and the genes
involved in these diseases are also being identified. Proteins
encoded by all these genes are targets for therapeutic
drugs. However, drugs that can be applied to human
function and disease will not simply emerge from genomic
information. Conventional drug development for a single
disease is a lengthy, tedious and extremely expensive
process. Technologies that eliminate the major hurdles
facing drug development in the post-genomic era would be of
substantial value.

SUMMARY OF THE INVENTION 

The present invention provides a method for producing
a mass-coded set of chemical compounds having the general
formula X(Y)n, where X is a scaffold, each Y is,
independently, a peripheral moiety, and n is an integer
greater than 1, typically from 2 to about 6. The method
comprises selecting a peripheral moiety precursor subset
from a peripheral moiety precursor set. The subset
includes a sufficient number of peripheral moiety
precursors that at least about 50, 100, 250 or 500 distinct
combinations of n peripheral moieties derived from the
peripheral moiety precursors in the subset exist. The
subset of peripheral moiety precursors is selected so that
at least about 90% of all possible combinations of n
peripheral moieties derived from the subset of peripheral
moiety precursors have a molecular mass sum which is
distinct from the molecular mass sums of all of the other
combinations of n peripheral moieties. The method further
comprises contacting the peripheral moiety precursor subset
with a scaffold precursor which has n reactive groups, each
of which is capable of reacting with at least one
peripheral moiety precursor to form a covalent bond. The
peripheral moiety precursor subset is contacted with the
scaffold precursor under conditions sufficient for the
reaction of each reactive group with a peripheral moiety
precursor, resulting in a mass-coded set of compounds of
the general formula X(Y)n.

In another embodiment, the invention provides a method
of identifying a member or members of a mass-coded
combinatorial library which are ligands for a biomolecule,
for example, a protein or a nucleic acid molecule, such as
DNA or RNA. The method comprises the steps of (1)
contacting the biomolecule with the mass-coded molecular
library, whereby members of the mass-coded molecular
library which are ligands for the biomolecule bind to the
biomolecule to form biomolecule-ligand complexes and
members of the mass-coded library which are not ligands for
the biomolecule remain unbound; (2) separating the
biomolecule-ligand complexes from the unbound members of



WO 99/35109 



PCT/US99/00024 



the mass-coded molecular library; (3) dissociating the 
biomolecule-ligand complexes; and (4) determining the 
molecular mass of each ligand to identify the set of n 
peripheral moieties present in each ligand. 

In a further embodiment, the invention provides a
method for identifying a member or members of a mass-coded
molecular library which are ligands for a biomolecule and
bind to the biomolecule at the binding site of a ligand
known to bind the biomolecule (a known ligand). The method
comprises the steps of: (1) contacting the biomolecule with
the mass-coded molecular library, so that members of the
mass-coded molecular library which are ligands for the
biomolecule bind to the biomolecule to form
biomolecule-ligand complexes and members of the mass-coded library
which are not ligands for the biomolecule remain unbound;
(2) separating the biomolecule-ligand complexes from the
unbound members of the mass-coded molecular library; (3)
contacting the biomolecule-ligand complexes with a ligand
known to bind the biomolecule, to dissociate
biomolecule-ligand complexes in which the ligand binds to the
biomolecule at the binding site of the known ligand,
thereby forming biomolecule-known ligand complexes and
dissociated ligands; (4) separating the dissociated ligands
and biomolecule-ligand complexes; and (5) determining the
molecular mass of each dissociated ligand to identify the
set of n peripheral moieties present in each dissociated
ligand.

In a yet further embodiment, the invention provides a
method for identifying a member or members of a mass-coded
combinatorial library which are ligands for a first
biomolecule but are not ligands for a second biomolecule.
The method comprises the steps of: (1) contacting the first
biomolecule with the mass-coded molecular library, whereby
members of the mass-coded molecular library which are
ligands for the first biomolecule bind to the first
biomolecule to form first biomolecule-ligand complexes and
members of the mass-coded library which are not ligands for
the first biomolecule remain unbound; (2) separating the
first biomolecule-ligand complexes from the unbound members
of the mass-coded molecular library; (3) dissociating the
first biomolecule-ligand complexes; (4) determining the
molecular mass of each ligand for the first biomolecule;
(5) contacting the second biomolecule with the mass-coded
molecular library, whereby members of the mass-coded
molecular library which are ligands for the second
biomolecule bind to the second biomolecule to form second
biomolecule-ligand complexes and members of the mass-coded
library which are not ligands for the second biomolecule
remain unbound; (6) separating the second
biomolecule-ligand complexes from the unbound members of the mass-coded
molecular library; (7) dissociating the second
biomolecule-ligand complexes; (8) determining the molecular mass of
each ligand for the second biomolecule; and (9) determining
which molecular masses determined in step (4) are not
determined in step (8). This provides the molecular masses
of members of the mass-coded combinatorial library which
are ligands for the first biomolecule, but are not ligands
for the second biomolecule.
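
Step (9) of the method above amounts to a set difference between two lists of measured masses. The following is a minimal sketch in Python, assuming the masses from steps (4) and (8) are available as floating-point values and that two masses closer than an illustrative tolerance `tol` are treated as the same mass. The function name, the tolerance default and the sample values are hypothetical and are not taken from the specification.

```python
def selective_masses(first_masses, second_masses, tol=0.001):
    """Return masses seen for the first biomolecule (step 4) that have
    no match within `tol` among the masses for the second (step 8)."""
    return [
        m for m in first_masses
        if all(abs(m - s) >= tol for s in second_masses)
    ]


# Hypothetical measured masses, for illustration only.
step4 = [350.123, 412.456, 498.789]
step8 = [412.4562, 600.0]

print(selective_masses(step4, step8))
```

Here the mass 412.456 is dropped because step (8) contains a mass within the tolerance of it; the remaining masses correspond to ligands of the first biomolecule only.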

In another embodiment, the method for identifying a
member or members of a mass-coded combinatorial library
which are ligands for a first biomolecule but are not
ligands for a second biomolecule comprises the steps of:
(1) contacting the second biomolecule with the mass-coded
molecular library, so that members of the mass-coded
molecular library which are ligands for the second
biomolecule bind to the second biomolecule to form second
biomolecule-ligand complexes and members of the mass-coded
library which are not ligands for the second biomolecule
remain unbound; (2) separating the second
biomolecule-ligand complexes from the unbound members of the mass-coded
molecular library; (3) contacting the first biomolecule
with the unbound members of the mass-coded molecular
library of step (2), whereby members of the mass-coded
molecular library which are ligands for the first
biomolecule bind to the first biomolecule to form first
biomolecule-ligand complexes and members of the mass-coded
library which are not ligands for the first biomolecule
remain unbound; (4) dissociating the first
biomolecule-ligand complexes; and (5) determining the molecular mass of
each ligand for the first biomolecule. Each molecular mass
determined corresponds to a set of n peripheral moieties
present in a ligand for the first biomolecule which is not
a ligand for the second biomolecule.

In yet another embodiment, the present invention
relates to a method for identifying a member of a mass-coded
combinatorial library which is a ligand for a
biomolecule and assessing the effect of the binding of
the ligand to the biomolecule. The method comprises the
steps of: contacting the biomolecule with the mass-coded
molecular library, whereby members of the mass-coded
molecular library which are ligands for the biomolecule
bind to the biomolecule to form biomolecule-ligand
complexes and members of the mass-coded library which are
not ligands for the biomolecule remain unbound; separating
the biomolecule-ligand complexes from the unbound members
of the mass-coded molecular library; dissociating the
biomolecule-ligand complexes; and determining the molecular
mass of each ligand to identify the set of n peripheral
moieties present in each ligand. The molecular mass of
each ligand corresponds to a set of n peripheral moieties
present in that ligand, thereby identifying a member of the
mass-coded combinatorial library which is a ligand for the
biomolecule. The method further comprises assessing in an
in vivo or in vitro assay the effect of the binding of the
ligand to the biomolecule on the function of the
biomolecule.

The method of the invention allows rapid production of
mass-coded combinatorial libraries comprising large numbers
of compounds. The mass-coding enables the identification
of individual combinations of scaffold and peripheral
moieties by molecular mass. The libraries prepared by the
method of the invention also allow the rapid identification
of compounds which are ligands for a given biomolecule.
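
Identification by molecular mass can be pictured as a lookup from an observed mass to the combination of peripheral moieties that produces it. The sketch below is only an illustration under stated assumptions: the scaffold mass, the moiety names and masses, and the matching tolerance are all hypothetical values chosen for readability, and the helper names `build_mass_table` and `decode` do not come from the specification.

```python
from itertools import combinations_with_replacement


def build_mass_table(scaffold_mass, moiety_masses, n):
    """List (compound mass, combination) for every combination of n
    peripheral moieties, repeats allowed and order ignored."""
    table = []
    for combo in combinations_with_replacement(sorted(moiety_masses), n):
        total = scaffold_mass + sum(moiety_masses[name] for name in combo)
        table.append((total, combo))
    return sorted(table)


def decode(table, observed_mass, tol=0.001):
    """Return every combination whose compound mass lies within tol."""
    return [combo for mass, combo in table if abs(mass - observed_mass) < tol]


# Hypothetical masses, not real chemical data.
moieties = {"A": 10.0, "B": 11.5, "C": 14.25}
table = build_mass_table(scaffold_mass=100.0, moiety_masses=moieties, n=2)

print(decode(table, 125.75))
```

In a library with no mass redundancy, `decode` returns exactly one combination for each observed mass; more than one result would indicate two combinations that share a mass sum.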

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A and 1B are flow charts illustrating a
procedure and an alternative procedure, respectively, for
selecting a subset of peripheral moiety precursors from
among a larger set of peripheral moiety precursors for the
production of a mass-coded combinatorial library.

Figure 2A is a graph illustrating the mass redundancy
of the combinatorial libraries resulting from a
computer-selected set of peripheral moiety precursors selected using
a mass-coding algorithm.

Figure 2B is a graph illustrating the mass redundancy
of the combinatorial libraries resulting from a set of
peripheral moiety precursors selected randomly.

Figure 2C presents graphs illustrating the mass
redundancy of the combinatorial libraries resulting from
(1) a computer-optimized set of peripheral moiety
precursors selected using a mass-coding algorithm (...) and
(2) a set of peripheral moiety precursors selected randomly
(-).

Figure 3 is a schematic diagram of a computer system
employing a digital processor assembly embodying the
invention method of selecting a subset of peripheral moiety
precursors which minimize or eliminate mass redundancy in a
library.

DETAILED DESCRIPTION OF THE INVENTION

The major hurdles in drug development include a need
for: 1) combinatorial chemistry technology that enables
rapid production of nearly unlimited numbers of compounds
while incorporating the ability to identify efficiently
single chemical compounds that bind tightly to a specific
biomolecule target, such as a protein or nucleic acid
molecule; 2) extremely efficient target-based screening
technologies that permit rapid identification of chemical
compounds within a large library mixture that become
tightly associated with a target biomolecule, even when the
function of that biomolecule is not well understood; and 3)
an information data set that describes how chemical
components interact with biomolecules of medical
importance.

The present invention provides a method of producing a
mass-coded set of compounds, such as a mass-coded
combinatorial library. The compounds are of the general
formula X(Y)n, wherein X is a scaffold, each Y is a
peripheral moiety and n is an integer greater than 1,
typically from 2 to about 6. The term "scaffold", as used
herein, refers to a molecular fragment to which two or more
peripheral moieties are attached via a covalent bond. The
scaffold is a molecular fragment which is common to each
member of the mass-coded set of compounds. The term
"peripheral moiety", as used herein, refers to a molecular
fragment which is bonded to a scaffold. Each member of the
set of mass-coded compounds will include a combination of n
peripheral moieties bonded to the scaffold and this set of
compounds forms a mass-coded combinatorial library.

The term "combination", as used herein, refers to all
permutations of m moieties having n members where m is an
integer greater than 2, n is an integer greater than 1 and
m is greater than or equal to n, such that:

(1) Permutations having n members in which a given moiety
is present from 0 to n times are included.

(2) Permutations having the same n moieties but ordered
differently are included once and only once.

The number of combinations of all permutations of m
moieties having n members may be calculated from the
formula:

Combinations = k! / ((k-n)! * n!), where k = m + (n-1)

For example, the combinations of the four moieties labeled
A, B, C, D which have 3 members are: AAA; AAB; AAC;
AAD; ABB; ABC; ABD; ACC; ACD; ADD; BBB;
BBC; BBD; BCC; BCD; BDD; CCC; CCD; CDD
and DDD. BAA and ABA, for example, are not counted
as separate combinations; only AAB is counted. In this
example, m = 4, n = 3 and the number of combinations is
given by

6! / ((6-3)! * 3!) = 20.
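
The counting rule above can be checked numerically. The following short Python sketch, offered only as an illustration, evaluates the formula with `math.comb` and also enumerates the combinations directly with `itertools.combinations_with_replacement`; the function name `count_combinations` is an assumption introduced here.

```python
from itertools import combinations_with_replacement
from math import comb


def count_combinations(m, n):
    """Evaluate k! / ((k-n)! * n!) with k = m + (n - 1)."""
    k = m + (n - 1)
    return comb(k, n)


# Enumerate the worked example: four moieties, three members each.
moieties = ["A", "B", "C", "D"]
listed = ["".join(c) for c in combinations_with_replacement(moieties, 3)]

print(listed)
print(len(listed), count_combinations(4, 3))
```

Both the enumeration and the formula give 20 for m = 4 and n = 3, and the enumeration contains AAB but neither ABA nor BAA.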

The terms "mass-coded set of compounds" and "mass-coded
combinatorial library", as used herein, refer to a
set of compounds of the formula X(Y)n, where X is a scaffold,
each Y is, independently, a peripheral moiety and n is an
integer greater than 1, typically from 2 to about 6. Such
a set of compounds is synthesized as a mixture by the
combination of a set of peripheral moiety precursors with a
scaffold precursor, and is designed to possess minimum mass
redundancy, given the requirement that a fixed number
(subset) of peripheral moiety precursors must be chosen
from a set of available peripheral moiety precursors.

The term "mass" or "molecular mass", as used herein,
refers to the exact mass of a molecule or collection of
chemical moieties in which each atom is the most abundant
naturally occurring isotope for the particular element.
Exact masses and their determination by mass spectrometry
are discussed by Pretsch et al., Tables of Spectral Data
for Structure Determination of Organic Compounds, second
edition, Springer-Verlag (1989), and Holden et al., Pure
Appl. Chem. 55: 1119-1136 (1983), the contents of each of
which are incorporated herein by reference in their
entirety.
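
As an illustration of the exact-mass definition, the sketch below sums the masses of the most abundant isotopes for a given elemental composition. It is a minimal example, assuming a small table of isotope masses (1H, 12C, 14N, 16O) rounded from standard reference tables; the table, the function name and the choice of glycine as a test molecule are additions for illustration and do not appear in the specification.

```python
# Masses of the most abundant naturally occurring isotopes, in atomic
# mass units, rounded from standard reference tables (assumed values).
MONOISOTOPIC_MASS = {
    "H": 1.00782503,   # 1H
    "C": 12.0,         # 12C, exact by definition
    "N": 14.00307400,  # 14N
    "O": 15.99491462,  # 16O
}


def exact_mass(composition):
    """Sum isotope masses for a composition such as {"C": 2, "H": 5}."""
    return sum(MONOISOTOPIC_MASS[el] * count
               for el, count in composition.items())


# Glycine, C2H5NO2, used here only as a familiar test case.
glycine = {"C": 2, "H": 5, "N": 1, "O": 2}
print(round(exact_mass(glycine), 4))
```

The exact mass differs from the average molecular weight, which is computed from isotope-averaged atomic weights instead.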

"Minimum mass redundancy" , as the term is used herein, 
is exhibited by a set of compounds of the formula X(Y)n
formed by reaction of a scaffold precursor having n
reactive groups, where n is an integer greater than 1,
typically from 2 to about 6, with a subset of peripheral
moiety precursors in which at least about 90% of the
possible combinations of n peripheral moieties derived from
the subset of peripheral moiety precursors have a molecular
mass sum which is distinct from the molecular mass sum of
any other combination of n peripheral moieties derived from
the subset. The molecular mass sum of a combination of
peripheral moieties is the sum of the masses of each
peripheral moiety within the combination. For the present
purposes, two molecular masses are distinct if they can be
distinguished by mass spectrometry or high resolution mass
spectrometry. For example, molecular masses which differ
by at least 0.001 atomic mass units can be distinguished by
high resolution mass spectrometry.
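
The definition of minimum mass redundancy can be expressed as a simple computation. The following is a minimal sketch, assuming that peripheral moiety masses are given as a list of numbers and that two mass sums are distinct when they differ by at least a tolerance `tol` (0.001 atomic mass units, following the example above). The function names and the toy masses are hypothetical.

```python
from itertools import combinations_with_replacement


def distinct_fraction(moiety_masses, n, tol=0.001):
    """Fraction of combinations of n moieties whose mass sum differs
    by at least `tol` from the mass sum of every other combination."""
    sums = sorted(sum(c) for c in
                  combinations_with_replacement(moiety_masses, n))
    distinct = 0
    for i, s in enumerate(sums):
        # In a sorted list only the two neighbours need to be checked.
        left_ok = i == 0 or s - sums[i - 1] >= tol
        right_ok = i == len(sums) - 1 or sums[i + 1] - s >= tol
        if left_ok and right_ok:
            distinct += 1
    return distinct / len(sums)


def has_minimum_mass_redundancy(moiety_masses, n, threshold=0.90):
    """Apply the 'at least about 90%' criterion."""
    return distinct_fraction(moiety_masses, n) >= threshold


# Toy masses: 1 + 3 and 2 + 2 collide, so two of six sums are redundant.
print(distinct_fraction([1.0, 2.0, 3.0], 2))
print(has_minimum_mass_redundancy([1.0, 2.0, 4.0], 2))
```

Note that a collision removes both colliding combinations from the distinct count, which matches the requirement that a mass sum be distinct from that of any other combination.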

It is to be understood that the molecular mass sum of
the combination of the n peripheral moieties in a
particular compound of the formula X(Y)n is the collective
contribution of the n peripheral moieties to the molecular
mass of the compound. As each compound within the set
includes a constant scaffold, the difference in the
molecular masses of two compounds within the mass-coded set
of compounds is the difference in the molecular mass sums
of the set of peripheral moieties in each compound.
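
The statement above can be written out as a short derivation. The notation is introduced here for illustration only: M denotes the molecular mass of a compound, M_X the contribution of the scaffold, and m(Y_i) the contribution of the i-th peripheral moiety; the superscripts a and b label two compounds of the set.

```latex
\begin{align*}
M\bigl(X(Y)_n\bigr) &= M_X + \sum_{i=1}^{n} m(Y_i) \\
M^{(a)} - M^{(b)}
  &= \Bigl(M_X + \sum_{i=1}^{n} m\bigl(Y_i^{(a)}\bigr)\Bigr)
   - \Bigl(M_X + \sum_{i=1}^{n} m\bigl(Y_i^{(b)}\bigr)\Bigr) \\
  &= \sum_{i=1}^{n} m\bigl(Y_i^{(a)}\bigr)
   - \sum_{i=1}^{n} m\bigl(Y_i^{(b)}\bigr)
\end{align*}
```

The scaffold term cancels, so two compounds have distinct molecular masses exactly when their peripheral moiety mass sums are distinct.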

The method of the invention comprises selecting a
peripheral moiety precursor subset from a larger peripheral
moiety precursor set. Details of the preferred selection
process are discussed later with reference to Figures 1A,
1B and 3. The subset includes a sufficient number of
peripheral moiety precursors so that, in one embodiment, at
least about 50 distinct combinations of n peripheral
moieties derived from the peripheral moiety precursors in
the subset can be formed. In another embodiment, at least
about 100 distinct combinations of n peripheral moieties
can be formed. In a further embodiment, at least about 250
distinct combinations of n peripheral moiety precursors can
be formed, and, in yet another embodiment, at least about
500 distinct combinations of n peripheral moieties can be
formed.
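
The combination formula given earlier fixes how many peripheral moiety precursors a subset needs in order to reach a target number of combinations. The sketch below is an illustration only; it assumes each precursor yields one peripheral moiety, and the function name `min_subset_size` is hypothetical.

```python
from math import comb


def min_subset_size(n, target):
    """Smallest number m of precursors for which the number of
    combinations, C(m + n - 1, n), is at least `target`."""
    m = 1
    while comb(m + n - 1, n) < target:
        m += 1
    return m


# Subset sizes needed for a scaffold precursor with n = 3 reactive groups.
for target in (50, 100, 250, 500):
    print(target, min_subset_size(3, target))
```

For n = 3, six precursors already give 56 combinations, which exceeds the threshold of 50, while 14 precursors are needed to exceed 500.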

The subset of peripheral moiety precursors is selected
so that at least about 90% of all possible combinations of
n peripheral moieties derived from the subset have a
molecular mass sum which is distinct from the molecular
mass sums of all of the other combinations of n peripheral
moieties. The method further comprises contacting the
peripheral moiety precursor subset with a scaffold
precursor which has n reactive groups, each of which is
capable of reacting with at least one peripheral moiety
precursor to form a covalent bond. The peripheral moiety
precursor subset is contacted with the scaffold precursor
under conditions sufficient for the reaction of each
reactive group with a peripheral moiety precursor,
resulting in a mass-coded set of compounds.

In one embodiment, at least about 95% of all possible
combinations of n peripheral moieties derived from the
peripheral moiety precursor subset have a molecular mass
sum which is distinct from the molecular mass sums of all
of the other combinations of n peripheral moieties. In
another embodiment, each of the possible combinations of n
peripheral moieties derived from the subset has a molecular
mass sum which is distinct from the molecular mass sums of
all of the other combinations of n peripheral moieties.

The scaffold precursor can be any molecule comprising
two or more reactive groups which are capable of reacting
with a peripheral moiety precursor reactive group to form a
covalent bond. For example, suitable scaffold precursors
can have a wide range of sizes, shapes, degrees of
flexibility and charges. The reactive groups should be
incapable of intramolecular reaction under the conditions
employed. Further, a scaffold precursor molecule should
not react with another scaffold precursor molecule under
the conditions employed. The scaffold precursor can also
include any additional functional groups which are masked
or protected or which do not interfere with the reaction of
the reactive groups with the peripheral moiety precursors.

Preferably, the scaffold precursor comprises one or
more saturated, partially unsaturated or aromatic cyclic
groups, such as a cyclic hydrocarbon or heterocyclic group.
In scaffold precursors comprising two or more cyclic
groups, the cyclic groups can be fused, connected via a
direct bond or connected via an intervening group, such as
an oxygen atom, an NH group or a C^-alkylene group. At
least one cyclic group is substituted by one or more
reactive groups. The reactive groups can be attached to
the cyclic group directly or via an intervening group, such
as a C1-6-alkylene group, preferably a methylene group.

Examples of suitable scaffold precursors include
reactive group-substituted benzene, biphenyl, cyclohexane,
bipyridyl, N-phenylpyrrole, diphenyl ether, naphthalene and
benzophenone. Other suitable classes of scaffold
precursors are shown below.

In these examples, each of the indicated substituents R is,
independently, a reactive group, and the scaffold precursor
can include one or more additional functional groups which
either (1) are masked or protected to prevent their
reaction with a peripheral moiety precursor (e.g., scaffold
precursors f and g, above) or (2) do not react either with
R or with a peripheral moiety precursor under the given
reaction conditions (e.g., scaffold precursor h, above, in
which R = C(O)O(C6F5) and the peripheral moiety precursors
include primary amino groups).

A peripheral moiety precursor is a compound which
includes a reactive group which is complementary to one or
more of the reactive groups of the scaffold precursor. In
addition to the reactive group, a peripheral moiety
precursor can include a wide variety of structural
features. For example, the peripheral moiety precursor can
include one or more functional groups in addition to the
reactive group. Any additional functional group should be
appropriately masked or not interfere with the reaction
between the scaffold precursor and the peripheral moiety
precursor. In addition, two peripheral moiety precursors
should not react together under the conditions employed.
For example, a subset of peripheral moiety precursors can
include, in addition to the reactive groups,
functionalities selected from groups spanning a range of
charge, hydrophobicity/hydrophilicity, and sizes. For
example, the peripheral moiety precursor can include a
negative charge, a positive charge, a hydrophilic group or
a hydrophobic group.

In addition to the reactive groups, peripheral moiety
precursors can include, for example, functionalities
selected from among amino acid side chains, a nucleotide
base or nucleotide base analogue, sugar moieties,
sulfonamides, peptidomimetic groups, charged or polar
functional groups, alkyl groups and aryl groups.

For the present purposes, two reactive groups are
complementary if they are capable of reacting together to
form a covalent bond. In a preferred embodiment, the
bond-forming reactions occur rapidly under ambient conditions
without substantial formation of side products.
Preferably, a given reactive group will react with a given
complementary reactive group exactly once.

In one embodiment, the reactive group of the scaffold
precursor and the reactive group of the peripheral moiety
precursor react, for example, via nucleophilic
substitution, to form a covalent bond. In one embodiment,
the reactive group of the scaffold precursor is an
electrophilic group and the reactive group of the
peripheral moiety precursor is a nucleophilic group. In
another embodiment, the reactive group of the scaffold
precursor is a nucleophilic group, while the reactive group
of the peripheral moiety precursor is an electrophilic
group.

Complementary electrophilic and nucleophilic groups
include any two groups which react via nucleophilic
substitution under suitable conditions to form a covalent
bond. A variety of suitable bond-forming reactions are
known in the art. See, for example, March, Advanced
Organic Chemistry, fourth edition, New York: John Wiley and
Sons (1992), Chapters 10 to 16; Carey and Sundberg,
Advanced Organic Chemistry, Part B, Plenum (1990), Chapters
1-11; and Collman et al., Principles and Applications of
Organotransition Metal Chemistry, University Science Books,
Mill Valley, CA (1987), Chapters 13 to 20; each of which is
incorporated herein by reference in its entirety. Examples
of suitable electrophilic groups include reactive carbonyl
groups, such as carbonyl chloride (acyl chloride) and
carbonyl pentafluorophenyl ester groups, reactive sulfonyl
groups, such as the sulfonyl chloride group, and reactive
phosphonyl groups. Other electrophilic groups which can be
used include terminal epoxide groups and the isocyanate
group. Suitable nucleophilic groups include primary and
secondary amino groups and alcohol (hydroxyl) groups.

Examples of suitable scaffold precursors with
specified reactive groups are shown below.

In these examples, each R is, independently, an additional
reactive group which can be the same as the specified
reactive group or a different group.

Illustrated below are examples of suitable peripheral
moiety precursors having amino groups.

R in this case is an amino acid side chain, tBoc is
t-butoxycarbonyl, Ac is acetyl and tBu is tertiary butyl.

Examples of scaffold precursors and peripheral moiety
precursors which have complementary reactive groups include
the following, which are provided for the purposes of
illustration and are not to be construed as limiting in any
way:

1. The scaffold precursor includes from two to about six
reactive carbonyl groups, reactive sulfonyl groups or
reactive phosphonyl groups, or a combination thereof. Each
peripheral moiety precursor includes a primary or secondary
amino group which reacts with the scaffold precursor to
form an amide, sulfonamide or phosphonamidate bond.

2. The scaffold precursor includes from two to about six
primary or secondary amino groups or a combination thereof.
Each peripheral moiety precursor includes a reactive
carbonyl group, a reactive sulfonyl group or a reactive
phosphonyl group.

3. The scaffold precursor includes from two to about six
terminal epoxide groups. Each peripheral moiety precursor
includes a primary or secondary amino group. In the
presence of a suitable Lewis acid, the scaffold precursor
and the peripheral moiety precursors react to form β-amino
alcohols.

4. The scaffold precursor includes from two to about six
primary or secondary amino groups. Each peripheral moiety
precursor contains a terminal epoxide group.

5. The scaffold precursor includes from two to about six
isocyanate groups. Each peripheral moiety precursor
contains a primary or secondary amino group which reacts
with the scaffold precursor to form a urea.

6. The scaffold precursor includes from two to about six
primary or secondary amino groups, or a combination
thereof. Each peripheral moiety precursor contains an
isocyanate group.

7. The scaffold precursor includes from two to about six
isocyanate groups. Each peripheral moiety precursor
contains an alcohol group which reacts with the scaffold
precursor to form a carbamate.

8. The scaffold precursor includes from 2 to about 6
aromatic bromides. Each peripheral moiety precursor is an
organo-tributyl-tin compound. The scaffold precursor and
the peripheral moiety precursors are reacted in the
presence of a suitable palladium catalyst to form one or
more carbon-carbon bonds.

9. The scaffold precursor includes from 2 to about 6
aromatic halides or triflates. Each peripheral moiety
precursor includes a primary or secondary amino group.
The scaffold precursor and the peripheral moiety precursors
are reacted in the presence of a suitable palladium
catalyst to form one or more carbon-nitrogen bonds.

10. The scaffold precursor includes from two to about six
amino groups. Each peripheral moiety precursor contains an
aldehyde or ketone group which reacts with the scaffold
precursor under reducing conditions (reductive amination)
to form an amine.

11. The scaffold precursor includes from two to about six
aldehyde or ketone groups. Each peripheral moiety
precursor contains an amino group which reacts with the
scaffold precursor under reducing conditions (reductive
amination) to form an amine.






12. The scaffold precursor includes from two to about six
phosphorus ylide groups. Each peripheral moiety precursor
contains an aldehyde or ketone group which reacts with the
scaffold precursor (Wittig type reaction) to form an
alkene.

13. The scaffold precursor includes from two to about six
aldehyde or ketone groups. Each peripheral moiety
precursor contains a phosphorus ylide group which reacts
with the scaffold precursor (Wittig type reaction) to form
an alkene.

The scaffold is that portion of the scaffold precursor
which remains after each reactive group of the scaffold
precursor has reacted with a peripheral moiety precursor.
A peripheral moiety is that portion of the peripheral
moiety precursor which is bonded to the scaffold following
the bond-forming reaction. A peripheral moiety which
results from the reaction of a particular peripheral moiety
precursor with a reactive functional group of a scaffold
precursor is said to be "derived" from that peripheral
moiety precursor.

A peripheral moiety precursor can include one or more
functional groups in addition to the reactive group. One
or more of these additional functional groups can be
protected to prevent undesired reactions of these
functional groups. Suitable protecting groups are known in
the art for a variety of functional groups (Greene and
Wuts, Protective Groups in Organic Synthesis, second
edition, New York: John Wiley and Sons (1991), incorporated
herein by reference). Particularly useful protecting
groups include t-butyl esters and ethers, acetals, trityl
ethers and amines, acetyl esters, trimethylsilyl ethers and
trichloroethyl ethers and esters.

The compounds within the set are mass-coded as a
result of the selection of a subset of suitable peripheral
moiety precursors. The subset of peripheral moiety
precursors is selected such that for a scaffold precursor
having n reactive groups, where n is an integer from 2 to
about 6, there exist at least about 50, 100, 250 or 500
different combinations of n peripheral moieties derived
from the peripheral moiety precursor subset. At least
about 90% of the possible combinations of n peripheral
moieties derived from the peripheral moiety precursors
within the subset will have a distinct mass sum. In one
embodiment, the selection of suitable peripheral moiety
precursors for the production of a mass-coded set of
compounds includes one or more automated steps utilizing
hardware apparatus, software apparatus or any combination
thereof. In the preferred embodiment, a digital processor
assembly employs a suitable software routine which selects
a subset of peripheral moiety precursors which minimize or
eliminate mass redundancy in the library. Figure 3 is
illustrative of such apparatus employing a digital
processor assembly for carrying out the present invention
method.
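
To make the goal of such a routine concrete, the sketch below selects a subset by exhaustive search: it scores every subset of the requested size by the fraction of combinations with a unique mass sum and keeps the best one. This is a stand-in chosen for brevity, not the procedure of Figures 1A and 1B, and it is practical only for small inputs. The function names, the rounding of mass sums to three decimal places and the toy masses are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement


def unique_sum_fraction(masses, n, decimals=3):
    """Fraction of combinations of n masses whose rounded mass sum is
    shared with no other combination."""
    sums = [round(sum(c), decimals)
            for c in combinations_with_replacement(masses, n)]
    counts = Counter(sums)
    return sum(1 for s in sums if counts[s] == 1) / len(sums)


def select_subset(initial_masses, n, k):
    """Exhaustively pick the k-member subset with the least mass
    redundancy; the first best subset found is returned."""
    return max(combinations(initial_masses, k),
               key=lambda subset: unique_sum_fraction(subset, n))


# Toy initial set of five moiety masses, not real chemical data.
best = select_subset([1.0, 2.0, 3.0, 4.0, 5.0], n=2, k=3)
print(best, unique_sum_fraction(best, 2))
```

The number of candidate subsets grows combinatorially with the size of the initial set, so a practical routine would need a more economical search strategy than this enumeration.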

Referring to Figure 3, there is shown a computer
system 25 formed of (a) a digital processor 11 having
working memory 17 for executing programs, routines,
procedures and the like, (b) input means 21 coupled to the
digital processor 11 for providing data, parameters and the
like to support execution of the programs, routines and/or
procedures in the digital processor working memory 17, and
(c) output means 23 coupled to the digital processor 11 for
displaying results, prompts, messages and the like from
operation of the digital processor 11. The input means 21
include a keyboard, mouse and the like common in the art.
The output means 23 include a viewing monitor, printer and
the like common in the art. The invention software routine
27 is executed in the working memory 17 by the digital
processor 11 as follows.

First, a user interface prompts the end-user to input
indications of an initial set 13 of peripheral moiety
precursors and the exact masses of the peripheral moieties
which are derived therefrom. This initial set 13 may be
copied, transferred or otherwise obtained from a database
or other source such as is known in the art. The user
interface also obtains from the end-user a set of user
determined/desired criteria 19. In the preferred
embodiment, the user selected criteria 19 include (i) the
total count j of peripheral moiety precursors in the
initial set, (ii) the value of n indicating the number of
reactive groups of a subject scaffold precursor for which
the invention software routine 27 is to select a subset of
peripheral moiety precursors from the input initial set 13
and (iii) the number of members of the subset, k.
Preferably, the user interface enables the end-user to
interactively provide the user selected criteria 19 through
the input means 21 as indicated at 15 in Figure 3.

The digital processor 11 is responsive to the
foregoing input and stores the indications of the initial
set 13 of peripheral moiety precursors in a memory area 29
or data storage system associated locally or off disk with
the software routine 27. That is, the memory area 29 or
data storage system supports the invention software routine
27. For each peripheral moiety precursor in the initial
set 13 as indicated in memory area 29, an identifier and
indication of respective exact mass of the peripheral
moiety derived from the peripheral moiety precursor is
provided to the software routine 27. Upon receipt of the
peripheral moiety precursor identifiers, indications of
exact mass, and user selected criteria (n, j and k), the
software routine 27 determines and generates a subset of k
peripheral moiety precursors which minimize or eliminate
mass redundancy in a resulting library of compounds of the
formula X(Y)n, wherein X is a scaffold, each Y is,
independently, a peripheral moiety, and n is an integer
greater than 1, typically from 2 to about 6. Preferably,
the software routine 27 determines a subset of peripheral
moiety precursors in which at least about 90% of the
possible combinations of n peripheral moieties derived from
the subset have a distinct mass sum. The details of the
software routine 27 employed in the preferred embodiment
are discussed next for purposes of illustration and not
limitation. It is understood that other software or
firmware routines for accomplishing the present invention
method of selecting a subset of the initial set 13 of
peripheral moiety precursors are suitable and within the
purview of one skilled in the art given this disclosure.

A typical situation involves a scaffold precursor with
n reactive groups, where n is an integer, a set of j
peripheral moiety precursors, where j is an integer 6 or
greater, where the peripheral moieties derived from the
peripheral moiety precursors have molecular masses y_1,
y_2, ..., y_j. An example of a software routine which can be
employed to select a suitable subset of k peripheral moiety
precursors (k ≤ j) from the set of j peripheral moiety
precursors includes the following steps:

1. From an initial set of j peripheral moiety precursors,
choose every set of two peripheral moiety precursors.
If y_a = y_b, randomly remove either y_a or y_b.

2. From the remaining set of peripheral moiety
precursors, choose every set of four peripheral moiety
precursors. If y_a + y_b = y_c + y_d, randomly remove
either y_a, y_b, y_c or y_d.

3. From the remaining set of peripheral moiety
precursors, choose every set of six peripheral moiety
precursors. If y_a + y_b + y_c = y_d + y_e + y_f, randomly
remove either y_a, y_b, y_c, y_d, y_e or y_f.

If at any step 1 through 3 the remaining number of
peripheral moiety precursors becomes < k, then there
is no mass-coded subset k which can be made from set
j, and a new set j must be employed.

4. From the remaining computer selected set of peripheral
moiety precursors, choose any or all subsets of k
peripheral moiety precursors.

5. Generate all possible combinations of n peripheral
moiety precursors from this subset.

6. If the % mass redundancy of the resulting set of
combinations is found to be unacceptable, repeat step
5 until a desired mass-coded library has been obtained
or no further possible combinations of peripheral
moiety precursors remain. In the latter case, begin
again with step 1.
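
The six steps above can be sketched in Python as follows. This is one possible reading of the steps, not the software routine 27 itself. The function names, the `seed` and `decimals` parameters, the 10% redundancy threshold and the use of combinations with repetition in step 5 are assumptions introduced for illustration.

```python
import random
from collections import Counter
from itertools import combinations, combinations_with_replacement


def mass_key(total, decimals):
    """Quantize a mass sum so equality is tested to `decimals` places."""
    return round(total * 10 ** decimals)


def find_collision(pool, group_size, decimals):
    """Return identifiers of two disjoint groups with equal mass sums, or None."""
    seen = {}
    for group in combinations(sorted(pool), group_size):
        key = mass_key(sum(pool[name] for name in group), decimals)
        for other in seen.get(key, []):
            if set(other).isdisjoint(group):
                return other + group
        seen.setdefault(key, []).append(group)
    return None


def prune(pool, k, rng, decimals=2):
    """Steps 1-3: remove one random member per collision among 2, 4, then 6."""
    pool = dict(pool)  # identifier -> exact mass of the derived moiety
    for group_size in (1, 2, 3):
        while len(pool) >= k:
            hit = find_collision(pool, group_size, decimals)
            if hit is None:
                break
            del pool[rng.choice(hit)]
    if len(pool) < k:
        raise ValueError("fewer than k precursors remain; a new initial set is needed")
    return pool


def redundancy(masses, n, decimals=2):
    """Fraction of n-member combinations whose mass sum is not unique."""
    sums = [mass_key(sum(c), decimals)
            for c in combinations_with_replacement(masses, n)]
    counts = Counter(sums)
    return sum(1 for s in sums if counts[s] > 1) / len(sums)


def select_subsets(pool, k, n, max_redundancy=0.10, seed=0, decimals=2):
    """Steps 4-6: yield each k-member subset whose library is acceptably mass-coded."""
    pruned = prune(pool, k, random.Random(seed), decimals)
    for subset in combinations(sorted(pruned), k):
        masses = [pruned[name] for name in subset]
        if redundancy(masses, n, decimals) <= max_redundancy:
            yield subset
```

Because the removals in steps 1 through 3 are random, calling `select_subsets` again with a different `seed` corresponds to beginning again with step 1 when no acceptable subset is found.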



Once an above subset of mass-coded peripheral moiety
precursors is determined, the scaffold precursor is
contacted with the subset of complementary peripheral
moiety precursors under conditions suitable for bond-
forming reactions to occur between the peripheral moiety
precursors and the scaffold precursor. The mass-coded set
of compounds is, preferably, synthesized in solution as a
combinatorial library.

The foregoing selection of a subset from a larger
peripheral moiety precursor set and generation of a mass-
coded set of compounds using the selected subset is more
generally illustrated in Figures 1A and 1B. Referring to
Figure 1A, the larger set of peripheral moiety precursors
is provided at 31 from known sources. The end-user (e.g.,
chemist) selects an initial set of j peripheral moiety
precursors from the larger set 31 at step 33. Typically
the chemist chooses all of the larger set to form the
initial set at 33. The invention mass-coding selection
procedure 35 is applied to the initial set. The result of
the mass-coding procedure 35 is a subset 37 of peripheral
moiety precursors that satisfies the mass-coding criteria
outlined above. In step 39, this subset of peripheral
moiety precursors is used to generate all theoretical
subsets of k peripheral moiety precursors. Also in step
39, the mass redundancies of the libraries obtained from
all theoretical subsets of k peripheral moieties are
calculated, and only those subsets which yield mass-coded
libraries, as defined above, are passed to 41. The net
result is one or more subsets 41 of k peripheral moiety
precursors in which there are at least about 50, 100, 250,
or 500 distinct combinations of n peripheral moiety
precursors in a given subset and at least 90% of all
possible combinations of n peripheral moieties derived
from a given subset have a
molecular mass sum which is distinct from the molecular
mass sums of all of the other combinations of n peripheral
moieties, as discussed above. The subset(s) 41 of
peripheral moiety precursors would subsequently yield mass-
coded sets of compounds when contacted with an appropriate
scaffold precursor in the manner discussed above.

As an alternative to the single-step application of
the invention mass-coding selection procedure 35 in Figure
1A, multiple or stepped application of procedure 35 is
suitable and in certain cases may be advantageous. For
instance, using mass-coding procedures at each level allows
for rapid sorting into distinct sets, each of which may
yield optimal mass-coding. During the mass-coding process,
certain criteria reduce the set size as it is passed into
the next layer through mass-coding. This multi-layer
approach yields advantages in speed and the elimination of
mass redundancy.

Multiple application of mass-coding selection
procedure 35 on initial set 33 is illustrated in Figure 1B.
Here initial set 33 is divided into plural parts (the
starting larger set of peripheral moiety precursors 31 and
chemist selection 33 being similar to that in Fig. 1A).
The mass-coding selection procedure 35 is applied to each
plural part and results in intermediate resultant sets 43A,
43B, 43C. The mass-coding selection procedure 35 is applied
in a second round/level, but this time with intermediate
resultant sets 43A, 43B, 43C. This produces final sets
45A, 45B, 45C. Step 39 is as in Figure 1A and generates
the subsets 47A, 47B, 47C of k peripheral moiety precursors
that would subsequently yield mass-coded sets of compounds
when contacted with an appropriate scaffold precursor in a
manner discussed above.

It is understood that other variations between the
approach illustrated in Figure 1A and that in Figure 1B are
within the purview of one skilled in the art. The
foregoing discussion and Figures are for purposes of
illustrating and not limiting the present invention method.

In one embodiment, the scaffold precursor is contacted
with all members of the peripheral moiety precursor subset
simultaneously. In general, a scaffold precursor having n
reactive groups, where n is an integer from 2 to about 6,
will be contacted with at least about n molar equivalents
relative to the scaffold precursor of peripheral moiety
precursors from the selected subset. For example, the
scaffold precursor can be contacted with a solution
comprising each member of the subset in approximately equal
concentrations. For example, if the scaffold precursor
includes n reactive groups, where n is an integer greater
than 1, and the number of peripheral moiety precursors in
the subset is denoted by p, the scaffold precursor can be
contacted with about n/p to about (1.1)n/p molar
equivalents of each peripheral moiety precursor.
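
The stoichiometric range stated above can be worked through with a short Python sketch. The function name and the `excess` parameter are hypothetical; only the n/p and (1.1)n/p bounds come from the text.

```python
def equivalents_per_precursor(n, p, excess=1.1):
    """Return (low, high) molar equivalents of each precursor per scaffold.

    n: number of reactive groups on the scaffold precursor (greater than 1)
    p: number of peripheral moiety precursors in the subset
    """
    if n < 2 or p < 1:
        raise ValueError("n must be greater than 1 and p must be at least 1")
    low = n / p
    return low, excess * low


# A scaffold with 4 reactive groups and a subset of 12 precursors:
low, high = equivalents_per_precursor(4, 12)
print(f"{low:.3f} to {high:.3f} equivalents of each precursor")
```

With these bounds the precursors together supply between n and (1.1)n equivalents, which is consistent with the "at least about n molar equivalents" stated for the subset as a whole.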

In another embodiment, the scaffold precursor is
contacted with the members of the peripheral moiety
precursor subset sequentially. This results in the
formation of intermediate partially reacted scaffold
precursor molecules which include at least one peripheral
moiety and at least one reactive group. For example, the
scaffold precursor can be contacted with one or more
peripheral moiety precursors under conditions suitable for
bond formation to occur. The resulting intermediates can
then be contacted with one or more additional peripheral
moiety precursors under suitable conditions for bond
formation to occur. These steps can be repeated until each
scaffold precursor reactive group has reacted with a
peripheral moiety precursor.

In one embodiment, the reactive groups of the scaffold
precursor can react sequentially with the subset of
peripheral moiety precursors using a suitable reactive
group protection/deprotection scheme. For example, the
scaffold precursor can include two or more sets of reactive
groups, where one set is unprotected and another set is
protected, or where two sets are masked by different
protecting groups. An example is the use of the scaffold
precursor

[chemical structure not reproduced]

which contains one unprotected reactive group and two
protected reactive groups. In this case, the unprotected
pentafluorophenyl ester can react with a peripheral moiety
precursor first (e.g., a primary amine). Either the
Cl3CCH2O-protected group or the benzyloxy-protected group
can then be deprotected using standard methods and reacted
with a set of peripheral moiety precursors. Finally, the
remaining protected group or groups can be deprotected and
reacted with a set of peripheral moiety precursors.

Following the reaction of each scaffold precursor
reactive group with a peripheral moiety precursor, any
peripheral moiety having a protected functional group can
be deprotected using methods known in the art.

The ability to identify individual scaffold plus
peripheral moiety combinations derived from such a mixture



WO 99/35109 



PCT/US99/00024 



is a consequence of the mass-coding of the library and the
ability of mass spectrometry to identify a molecular mass.
This allows the identification of individual scaffold plus
peripheral moiety combinations within the set which have a
particular activity, such as binding to a particular
biomolecule.

In one embodiment, the present invention provides a
method for identifying a compound or compounds within a
mass-coded combinatorial library which bind to, or are
ligands for, a biomolecule, such as a protein or nucleic
acid molecule. The mass-coded combinatorial library can be
produced, for example, by the method of the invention
disclosed above. The target biomolecule, such as a protein,
is contacted with the mass-coded combinatorial library,
and, if any members of the library are ligands for the
biomolecule, biomolecule-ligand complexes form. Compounds
which do not bind the biomolecule are separated from the
biomolecule-ligand complexes. The biomolecule-ligand
complexes are dissociated and the ligands are separated and
their molecular masses are determined. Due to the mass-
coding of the combinatorial library, a given molecular mass
is characteristic of a unique combination of peripheral
moieties or only a small number of such combinations.
Thus, a ligand's molecular mass allows the determination of
its composition.
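
How a measured mass maps back to a composition can be sketched in Python as a lookup table. The sketch assumes that the observed value has already been converted to a neutral molecular mass, that the scaffold contributes a constant mass, and that combinations are counted with repetition. The function names and the masses used in the example are hypothetical.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def build_mass_index(scaffold_mass, moiety_masses, n, decimals=2):
    """Map each quantized compound mass to the moiety combinations producing it."""
    index = defaultdict(list)
    for combo in combinations_with_replacement(sorted(moiety_masses), n):
        total = scaffold_mass + sum(moiety_masses[name] for name in combo)
        index[round(total * 10 ** decimals)].append(combo)
    return index


def decode(index, observed_mass, decimals=2):
    """Return the candidate combinations for an observed ligand mass."""
    return index.get(round(observed_mass * 10 ** decimals), [])


# Hypothetical masses chosen so that every pair sum is distinct.
index = build_mass_index(100.0, {"A": 10.0, "B": 20.0, "C": 40.0}, n=2)
print(decode(index, 130.0))  # [('A', 'B')]
```

In a fully mass-coded library every entry of the index holds a single combination; where redundancy remains, `decode` returns the small number of candidates that share the mass.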

In one embodiment, the target is immobilized on a
solid support by any known immobilization technique. The
solid support can be, for example, a water-insoluble matrix
contained within a chromatography column or a membrane.
The mass-coded set of compounds can be applied to a water-
insoluble matrix contained within a chromatography column.
The column is then washed to remove non-specific binders.
Target-bound compounds (ligands) can then be dissociated by
changing the pH, salt concentration, organic solvent
concentration, or other methods, such as competition with a
known ligand to the target. The dissociated ligands are
injected directly onto a reverse phase column. The reverse
phase column acts as a concentrator/collector and can be
interfaced directly to a mass spectrometer, such as an
electrospray mass spectrometer (ES-MS). Mass information
provided by the mass spectrometer is sufficient for
identifying the combination of scaffold and peripheral
moieties within the ligand.

In another embodiment, the target is free in solution
and is incubated with the mass-coded set of compounds.
Compounds which bind to the target (ligands) are
selectively isolated by a size separation step such as gel
filtration or ultrafiltration. In one embodiment, the
mixture of mass-coded compounds and the target biomolecule
is passed through a size exclusion chromatography column
(gel filtration), which separates any ligand-target
complexes from the unbound compounds. The ligand-target
complexes are transferred to a reverse-phase chromatography
column, which dissociates the ligands from the target. The
dissociated ligands are then analyzed by mass spectrometry.
Mass information provided by the mass spectrometer is
sufficient for identifying the scaffold and peripheral
moiety composition of the ligand. This approach is
particularly advantageous in situations where
immobilization of the target may result in a loss of
activity.

Once single ligands are identified by the above-
described process, various levels of analysis can be
applied to yield structure-activity relationship (SAR)
information and to guide further
optimization of the affinity, specificity and bioactivity
of the ligand. For ligands derived from the same scaffold,
three-dimensional molecular modeling can be employed to
identify significant structural features common to the
ligands, thereby generating families of small-molecule
ligands that presumably bind at a common site on the target
biomolecule.

In order to identify a consensus, highest affinity
ligand for a particular binding site, this analysis should
include a ranking of the members of a given ligand family
with respect to their affinities for the target. This
process can provide this information by identifying both
low and high affinity ligands for a target biomolecule in
one experiment. For example, when the screen utilizes an
immobilized target, the dissociation rate of the ligand is
inversely correlated with the number of column volumes
employed during elution of the ligand from its target. When the
screen utilizes the target free in solution, weak affinity
ligands can be selected by using a higher concentration of
the target.

Given that each mass-coded set of compounds is
synthesized with a limited number of peripheral moiety
precursors, the disclosed approach can, in certain cases,
identify a superior ligand which combines structural
features of molecules synthesized in separate libraries.

When possible, the analysis of ligand structural
features is based on information regarding the target
biomolecule's structure, wherein the hypothetical consensus
ligand is computationally docked with the putative binding
site. Further computational analysis can involve a dynamic
search of multiple lowest energy conformations, which
allows comparison of high affinity ligands that are derived
from different scaffolds. The end goal is the
identification of both the optimal functionality and the
optimal vectorial presentation of the peripheral moieties
that yield the highest binding affinity/specificity. This
may provide the basis for the synthesis of an improved,
second-generation scaffold.

Due to the modular design of the mass-coded compounds,
computational analysis may identify the point of attachment
on the scaffold that has the least functional importance
with respect to affinity for the target. In many cases,
the ligand will not be completely engulfed by the target
biomolecule, and one peripheral moiety will be pointed away
from the biomolecule towards the bulk solvent. Three-
dimensional alignment of a family of ligands will reveal a
high degree of functional variability at the site that is
presented to the solvent. Modification at this site can
then be used to optimize the affinity. For example, the
noncritical reactive site can be removed and replaced with
a small unreactive group, such as a hydrogen atom or a
methyl group. A set of compounds structurally identical
except for the peripheral moiety at this position can be
examined to identify compounds that most effectively
inhibit or promote the binding of another protein/DNA/RNA
molecule. Also, the peripheral moiety at this position can
be modified to link two ligands together. The joining of
two ligands could in certain cases yield a ligand with
improved affinity and specificity, if one joins molecules
that bind to adjacent sites, or yield a designed
biomolecule dimerizer.

A variety of screening approaches can be used to
obtain ligands that possess high affinity for one target
but significantly weaker affinity for another closely
related target. One screening strategy is to identify
ligands for both biomolecules in parallel experiments and
to subsequently eliminate common ligands by a cross-
referencing comparison. In this method, ligands for each
biomolecule can be separately identified as disclosed
above. This method is compatible with both immobilized
target biomolecules and target biomolecules free in
solution.
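
The cross-referencing comparison amounts to a set difference over the ligands identified in the two parallel screens. A minimal Python sketch follows, assuming each ligand is represented by its decoded combination of peripheral moiety labels; the function name and the labels are hypothetical.

```python
def selective_ligands(target_hits, related_hits):
    """Ligands identified for the target but not for the closely related biomolecule."""
    return set(target_hits) - set(related_hits)


# Hypothetical decoded hits from two parallel screens.
target_hits = [("A", "B"), ("A", "C"), ("B", "C")]
related_hits = [("A", "C"), ("C", "C")]
print(sorted(selective_ligands(target_hits, related_hits)))
```

Representing each ligand as a sorted tuple of labels keeps the comparison independent of the order in which moieties were decoded.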

For immobilized target biomolecules, another strategy
is to add a preselection step that eliminates all ligands
that bind to the non-target biomolecule from the library.
For example, a first biomolecule can be contacted with a
mass-coded combinatorial library as described above.
Compounds which do not bind to the first biomolecule are
then separated from any first biomolecule-ligand complexes
which form. The second biomolecule is then contacted with
the compounds which did not bind to the first biomolecule.
Compounds which bind to the second biomolecule can be
identified as described above and have significantly
greater affinity for the second biomolecule than for the
first biomolecule.

The screening approach detailed above can also be
applied to identify ligands that selectively interact with
an altered version of the same biomolecule, wherein the
first biomolecule is the unaltered biomolecule and the
second biomolecule is an altered or variant version of the
biomolecule. The second biomolecule can, for example, have
an amino acid sequence which differs from the amino acid
sequence of the first biomolecule by the insertion,
deletion or substitution of one or more amino acid
residues. For example, the second biomolecule can include
a specific amino acid mutation that is linked to the
progression of a particular disease. Alternatively, the
second biomolecule can also differ from the first
biomolecule in having a different post-translational
modification, such as an extra site of phosphorylation or
glycosylation, or it may be truncated or aberrantly fused
with another biomolecule.

The screening approach detailed above can also serve
as a method for identifying small molecule ligands that
bind at the same site on a biomolecule as another known,
biologically relevant ligand. This known ligand can be
another biomolecule, such as a protein or peptide, or it
can be a DNA or RNA molecule, or a substrate or cofactor
involved in an enzymatic reaction. In one embodiment, the
first and second biomolecules are both proteins. The first
protein is a complex of the protein and the known ligand,
while the second protein is the protein alone. Compounds
which bind to the protein alone, but not to the complex of
the protein with the known ligand, bind to the protein at
the binding site of the known ligand. This approach is
especially well suited to the development of small molecule
replacements of known therapeutic ligands, such as peptides
or proteins.

An advantage of the present method is that it can be
used to identify chemical compounds that bind tightly to
any biomolecule of interest, even when the function of that
biomolecule is not well understood, as is often the case
with gene products defined through genomics, or when a
functional assay is not available. The screening
technologies described can be miniaturized to provide
massive parallel screening capabilities.

A ligand for a biomolecule of unknown function which
is identified by the method disclosed above can also be
used to determine the biological function of the
biomolecule. This is advantageous because although new
gene sequences continue to be identified, the functions of
the proteins encoded by these sequences and the validity of
these proteins as targets for new drug discovery and
development are difficult to determine and represent
perhaps the most significant obstacle to applying genomic
information to the treatment of disease. Target-specific
ligands obtained through the process described in this
invention can be effectively employed in whole cell
biological assays or in appropriate animal models to
understand both the function of the target protein and the
validity of the target protein for therapeutic
intervention. This approach can also confirm that the
target is specifically amenable to small molecule drug
discovery. The ligands obtained through the process
described in this invention are small molecules and are,
thus, similar to actual human therapeutics (small molecule
drugs).

In one embodiment, a member of a combinatorial library
is identified as a ligand for a particular biomolecule
using the method described above. The ligand can then be
assessed in an in vitro assay for the effect of the binding
of the ligand to the biomolecule on the function of the
biomolecule. For a biomolecule having a known function,
the assay can include a comparison of the activity of the
biomolecule in the presence and absence of the ligand. If
the biomolecule is of unknown function, a cell which
expresses the biomolecule can be contacted with the ligand
and the effect of the ligand on the viability or function
of the cell is assessed. The in vitro assay can be, for
example, a cell death assay, a cell proliferation assay or
a viral replication assay. For example, if the biomolecule
is a protein expressed by a virus, a cell infected with
the virus can be contacted with a ligand for the protein.
The effect of the binding of the ligand to the
protein on viral viability can then be assessed.

A ligand identified by the method of the invention can
also be assessed in an in vivo model or in a human. For
example, the ligand can be evaluated in an animal or
organism which produces the biomolecule. Any resulting
change in the health status (e.g., disease progression) of
the animal or organism can be determined.

For a biomolecule, such as a protein or a nucleic acid
molecule, of unknown function, the effect of a ligand which
binds to the biomolecule on a cell or organism which
produces the biomolecule can provide information regarding
the biological function of the biomolecule. For example,
the observation that a particular cellular process is
inhibited in the presence of the ligand indicates that the
process depends, at least in part, on the function of the
biomolecule.

The mass-coded libraries provided by the present
method enable the development of an information set that
describes how the universe of small molecules interacts
with any biomolecule encoded within the human and other
genomes. This information set would include data
regarding: 1) those libraries and components therein which
bind to the target biomolecule, 2) quantitative structure-
activity relationships (SAR) on chemical functionalities
which contribute to the binding affinity of a compound for
a biomolecule target, and 3) the domains of the biomolecule
that are bound by chemical compounds. The database can be
used to expedite drug development in a number of ways, for
example, by identifying chemical pharmacophores that
interact with high affinity with a specific drug binding
site.

The invention will now be further and more
specifically described in the following examples.

EXAMPLES

Example 1 Application of Mass-coding by Computer
Algorithms: Comparison of Mass-coded and
Non-Mass-coded Combinatorial Libraries

The following is an analysis of the application of
mass-coding algorithms towards the design of combinatorial
libraries. The sequence of steps involved in identifying
subsets of peripheral moiety precursors that can be allowed
to react with a predetermined scaffold precursor to yield a
mass-coded combinatorial library of compounds with the
molecular formula X(Y)n is shown in Figure 1A; Figure 1B is
an alternate sequence of steps. It is to be understood that
the molecular mass sum of the combination of the n
peripheral moieties in a particular compound of the formula
X(Y)n is the collective contribution of the n peripheral
moieties to the molecular mass of the compound. As each
compound within the library includes a constant scaffold,
the mass redundancy of the mass-coded library is equivalent
to the molecular mass sum redundancy of all combinations of
n peripheral moieties derived from the identified subset of
peripheral moiety precursors.

The mass-coding analysis was performed on the initial
set of 22 peripheral moieties shown below. This initial
set was selected arbitrarily. Included were peripheral
moiety precursors having the same exact mass. The master
set consisted of the peripheral moiety precursors shown
below, along with the exact masses of the resulting
peripheral moieties. The molecular masses given are the
exact molecular masses and not the isotope averages. The
exact molecular masses are also adjusted for any atoms
which are lost as a result of the reaction with the
scaffold precursor (in this case the loss of a hydrogen
atom). From the initial set of 22 peripheral moiety
precursors, two sets of 16 peripheral moiety precursors
were generated. One set was chosen by the computer using
the mass-coding algorithm described herein (computer
selected set). The other set was randomly chosen.

From each set of 16 peripheral moiety precursors the
computer generated every possible subset of 12 peripheral
moiety precursors. These subsets were used to generate all
combinations of peripheral moiety precursors taken 4 at a
time (representing libraries synthesized with a scaffold
precursor having four reactive groups, such as four
pentafluorophenyl esters). This process yielded two sets
of 16 peripheral moiety precursors containing 1820 subsets
of 12 each. Theoretically, these subsets of 12 peripheral
moiety precursors would each yield a library of 1365
compounds containing different peripheral moiety
combinations when allowed to react simultaneously with an
appropriate scaffold precursor containing four reactive
groups (the number of combinations of 12 items taken 4 at a
time with repetition allowed, 15!/[(15-4)! × 4!] = 1365).
The computer sorted every
precursor subset and checked for mass redundancy in the
resultant libraries (in this example mass redundancies were
checked to the second significant digit after the decimal
point).

It is noteworthy that the mass-coding algorithms and
the mass redundancy check are both flexible in that it is
possible to adjust the computational filter to check mass
redundancy to any significant figure. This architecture
for mass-coding allows for rapid automated mass-coding,
ensures that a significant portion of the libraries
generated with the computer selected set have less than 10%
redundancy, and includes parameters for peripheral moiety
precursor selection outside of exact mass. The
computational requirements for this selection are fairly
significant. The mass-coding algorithms are essential
because it is computationally intractable to brute force
calculate and check every possible set of peripheral moiety
precursors from a master set of 60 or more peripheral
moiety precursors.

[Structures of the peripheral moiety precursors are not reproduced. Labels and exact masses recoverable from the figure:]

78a 140.0034
79a 210.1283
86a 230.1756
104a 116.0711
94a 186.1494
108a 101.0351
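
The subset and library counts quoted in this example, and the growth that makes exhaustive checking impractical for large master sets, can be reproduced with a short Python sketch. The function names are hypothetical, and the choice of 12-member subsets for the master set of 60 is an assumption made only to illustrate the scale.

```python
from math import comb


def subset_count(pool_size, subset_size):
    """Number of distinct subsets of `subset_size` precursors from a pool."""
    return comb(pool_size, subset_size)


def library_size(k, n):
    """Number of combinations of k precursors taken n at a time with repetition."""
    return comb(k + n - 1, n)


print(subset_count(16, 12))  # 1820 subsets of 12 from a set of 16
print(library_size(12, 4))   # 1365 combinations per library
# Subsets of 12 from a master set of 60, each needing its own redundancy check:
print(subset_count(60, 12))
```

The last count is many orders of magnitude larger than the 1820 subsets examined in this example, which is why a pruning step ahead of the exhaustive check is useful.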



RESULTS 

The computer selected set of 16 peripheral moiety
precursors contained 86a, 79a, 13a, 108a, 76a, 20a, 69a,
1a, 70a, 26a, 24a, 36a, 97a, 94a, 104a, and 21a. The set
of 16 randomly chosen peripheral moiety precursors
contained 79a, 13a, 20a, 69a, 1a, 26a, 24a, 104a, 52a, 54a,
19a, 77a, 53a, 21a, 55a, and 36a. The libraries generated from
the computer selected set of peripheral moiety precursors
had an average mass redundancy of 11.5% per library with
234 libraries having mass redundancies of less than 5% and
972 libraries having mass redundancies of less than 10%
(Figure 2A). The libraries generated from the randomly
chosen set of peripheral moiety precursors had an average
mass redundancy of 60.7% with no libraries having a mass
redundancy of less than 10% (Figure 2B). A direct
graphical comparison of the mass redundancies of the two
sets of libraries is shown in Figure 2C. The libraries
derived from the computer-selected set of peripheral moiety
precursors and the corresponding mass redundancies are
listed in the Table below.

Example 2 Development of ligands for a monofunctional protein

A mass-coded combinatorial library can be used to
identify ligands that have a high affinity for a
monofunctional protein. One such monofunctional protein is
the serine protease trypsin. Ligands that exhibit a high
affinity for trypsin would be candidates to screen further
for their ability to inhibit the proteolytic activity of
trypsin. The identification of ligands to trypsin involves
the following steps: trypsin is covalently biotinylated by
incubation of the protein with a chemically activated
biotin precursor. The biotin-trypsin conjugate is
immobilized by binding to a streptavidin-derivatized
water-insoluble column matrix. The mass-coded combinatorial
library is solubilized in an appropriate binding buffer and
injected onto a column containing the trypsin+streptavidin
complex. Compounds that do not bind to the column are



WO 99/35109 



PCT/US99/00024 




washed off with binding buffer. Compounds that bind to the
column are dissociated by a change in the buffer
conditions, such as a change in the pH or an increase in
the percentage of organic solvent. These compounds are then
loaded onto a reversed-phase column that is placed
downstream of the trypsin+streptavidin column. The
compounds are eluted from the reversed-phase column and
analyzed by mass spectrometry. Molecular masses that
correspond to ligands for trypsin are identified by
eliminating those masses which are also observed when the
library is similarly screened with a streptavidin column.
The molecular mass of each trypsin ligand identifies one
combination of peripheral moieties plus scaffold. The
individual compound or compounds that result from the
identified combination of peripheral moieties plus scaffold
are synthesized and tested for their in vitro activity as
inhibitors of trypsin.
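The control-subtraction step described above (discarding masses that also appear in the streptavidin-only screen) can be sketched as follows. This is an illustrative sketch only. The matching tolerance of 0.01 mass units, the function name, and the sample masses are assumptions and do not come from the experiments described here.

```python
from bisect import bisect_left


def subtract_control(target_masses, control_masses, tol=0.01):
    """Keep target-screen masses with no control mass within +/- tol."""
    control = sorted(control_masses)
    kept = []
    for mass in target_masses:
        # First control mass that is not below the lower window edge.
        i = bisect_left(control, mass - tol)
        if i < len(control) and control[i] <= mass + tol:
            continue  # also seen with streptavidin alone: discard
        kept.append(mass)
    return kept


# Hypothetical observed masses from the two screens.
trypsin_screen = [500.10, 612.34, 733.20]
streptavidin_screen = [612.345, 900.00]
print(subtract_control(trypsin_screen, streptavidin_screen))
```

Masses that survive the subtraction are the ones attributed to the target protein rather than to streptavidin or the column matrix.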

Example 3 Development of ligands for a multifunctional protein

Many proteins, especially human proteins, are
multifunctional, and these functions are often mediated
through interactions with multiple proteins. Ligands that
bind to different sites on the protein might therefore
yield different therapeutic results. The human protein
HSP70 is one such example of a multifunctional protein.
HSP70 has been shown to interact with multiple
polypeptides, which are largely unfolded, to facilitate
their translocation and folding. This role of HSP70 has
been implicated in a variety of physiological processes,
including antigen processing/presentation, development of
certain cancers, and replication of a variety of human
viruses. A mass-coded combinatorial library can be used to
identify ligands that have a high affinity for HSP70 and
bind at different sites. These ligands for HSP70 can be
further evaluated in secondary assays to establish their
effects on the immune response, cancer progression, and
viral infection.

The identification of ligands to HSP70 involves the
following steps: HSP70 is covalently biotinylated by
incubation of the protein with a chemically activated
biotin precursor. The biotin-HSP70 conjugate is
immobilized by binding to a streptavidin-derivatized
water-insoluble column matrix. The mass-coded library is
solubilized in an appropriate binding buffer and injected
onto a column containing the HSP70-streptavidin complex.
Compounds that do not bind to the column are washed off
with binding buffer. Compounds that bind to the column are
dissociated by a change in the buffer conditions, such as a
change in the pH or an increase in the percentage of
organic solvent. Compounds that are dissociated from the
column are loaded onto a reversed-phase column that is
placed downstream of the HSP70-streptavidin column.
Compounds are eluted from the reversed-phase column and
analyzed by mass spectrometry. Masses that correspond to
ligands for HSP70 are identified by eliminating those
masses which are also observed when the library is
similarly screened with a streptavidin column. The mass of
each HSP70 ligand identifies one combination of peripheral
moieties plus scaffold. The individual compound(s) that
result from the identified combination of peripheral
moieties plus scaffold are synthesized and tested for their
in vivo ability to affect the immune response, cancer
progression, and viral infection.

Example 3 Development of ligands that affect the binding of a
known ligand to a protein

It is often the situation that a biologically
important ligand is known for a target protein, but
development of a high-throughput screen for molecules that
modulate the binding of that ligand is not practical. For
instance, it is known that HSP70 binds unfolded
polypeptides in the presence of ADP, and that the binding
of ATP to HSP70 leads to the dissociation of the
polypeptide. Mass-coded combinatorial libraries can be
used in the discovery of small molecule ligands that affect
the binding of ATP, ADP, or unfolded peptides to HSP70, and
one configuration is listed below: HSP70 is covalently
biotinylated by incubation of the protein with a chemically
activated biotin precursor. The biotin-HSP70 conjugate is
immobilized by binding to a streptavidin-derivatized
water-insoluble column matrix. The mass-coded library is
solubilized in an appropriate binding buffer and injected
onto a column containing the HSP70-streptavidin complex.
Compounds that do not bind to the column are washed off
with binding buffer. Compounds that bind to the column are
dissociated upon addition of ATP, ADP, or ADP plus an
unfolded peptide. Only compounds that bind to the same
sites on HSP70 as these known ligands will be eluted under
these conditions. Compounds that are dissociated from the
column are loaded onto a reversed-phase column that is
placed downstream of the HSP70-streptavidin column.
Compounds are eluted from the reversed-phase column and
analyzed by mass spectrometry. Masses that correspond to
ligands for HSP70 are identified by eliminating those
masses which are also observed when the library is
similarly screened with a streptavidin column. The mass of
each HSP70 ligand identifies one combination of peripheral
moieties plus scaffold. The individual compound(s) that
result from the identified combination of peripheral
moieties plus scaffold are synthesized and tested in vitro
for the ability to compete with these known ligands to
HSP70 and for their in vivo ability to affect the immune
response, cancer progression, and viral infection.

Example 4 Discovery of small molecule replacements for
protein therapeutics

In some instances, the known ligand to a target
protein is in fact another protein, and the binding of
these two proteins confers a therapeutic benefit. An
example of such an interaction is the binding of
granulocyte colony stimulating factor (G-CSF) to the G-CSF
receptor (G-CSF-R). Replacement of G-CSF with a
non-peptide small molecule can be undertaken using a
mass-coded combinatorial library, and one approach is
detailed below: in two separate and parallel experiments,
the mass-coded library is solubilized in an appropriate
binding buffer and incubated with either the G-CSF-R alone
or the G-CSF-R plus G-CSF. Compounds that bind to the
protein(s) are separated from the unbound compounds by
rapid size exclusion chromatography. The binding compounds
are loaded with the protein(s) onto a reversed-phase column
that is placed downstream of the size exclusion column. The
binding compounds are dissociated from the protein(s) and
are eluted from the reversed-phase column and analyzed by
mass spectrometry. Masses that correspond to compounds that
bind to the G-CSF/G-CSF-R interface are identified as those
masses which are only observed when the library is screened
with G-CSF-R alone; masses which are also observed in the
screen with the G-CSF/G-CSF-R complex are ignored. The
mass of each interface-specific compound identifies one
combination of peripheral moieties plus scaffold. The
individual compound(s) that result from the identified
combination of peripheral moieties plus scaffold are then
synthesized and tested for their in vitro or in vivo
ability to mimic G-CSF.

Example 5 Development of small molecules that dimerize
two proteins

Certain therapeutic proteins, such as erythropoietin
(EPO), are multivalent and act by binding two molar
equivalents of the target protein, thereby dimerizing the
target protein, which, in the case of EPO, is the EPO
receptor (EPO-R). The protein replacement strategy
outlined in Example 3 can be extended to yield non-peptide
compounds that act therapeutically by inducing the
dimerization of two EPO-R molecules. In two separate and
parallel experiments, the mass-coded library is solubilized
in an appropriate binding buffer and incubated with either
EPO-R alone or EPO-R plus EPO. Compounds that bind to the
protein(s) are separated from the unbound compounds by
rapid size exclusion chromatography. The bound compounds
are loaded with the protein(s) onto a reversed-phase column
that is placed downstream of the size exclusion column.
The bound compounds are dissociated from the protein(s) and
are eluted from the reversed-phase column and analyzed by
mass spectrometry. Masses that correspond to compounds
that bind to the EPO/EPO-R interface are identified as
those masses which are observed only when the library is
screened with EPO-R alone; masses which are also observed
in the screen with the EPO/EPO-R complex are ignored. The
mass of each interface-specific compound identifies one
combination of peripheral moieties plus scaffold. The
individual compound(s) that result from the identified
combination of peripheral moieties plus scaffold are
synthesized and tested for their in vitro ability to bind
to the target protein, EPO-R. Those compounds exhibiting
the highest affinity for the target protein are compared to
identify similarities among them. Ideally, it is observed
that one site of derivatization on the scaffold is
relatively unimportant for high affinity binding. The
peripheral moiety at this site is subsequently replaced
with a covalent tether that joins two molecules of the
highest affinity compound to yield a non-peptide compound
that dimerizes the target protein, EPO-R.

Example 6 Simultaneous target validation and small-molecule
drug discovery

An example of a class of target proteins whose roles
in a disease process can be validated by application of
target-specific ligands to a bioassay is the class of
proteins encoded by the open reading frames (ORFs) of the
Herpes Simplex Virus. The identification of ligands to an
ORF-encoded protein and the use of the resulting ligands to
determine the function of the ORF-encoded protein and its
validity as a target for anti-viral drug discovery involve
the following steps: the ORF-encoded protein is covalently
biotinylated by incubation of the ORF-encoded protein with
a chemically activated biotin precursor. The ORF-encoded
protein-biotin conjugate is immobilized by binding to a
streptavidin-derivatized water-insoluble column matrix.
The mass-coded library is solubilized in an appropriate
binding buffer and injected onto a column containing the
ORF-encoded protein+streptavidin complex. Compounds that
do not bind to the column are washed off with binding
buffer. Compounds that bind to the column are dissociated
by a change in the buffer conditions, such as a change in
the pH or an increase in the percentage of organic solvent.
These compounds are loaded onto a reversed-phase column
placed downstream of the ORF-encoded protein+streptavidin
column. The binding compounds are eluted from the
reversed-phase column and analyzed by mass spectrometry.
Molecular masses that correspond to ligands for the
ORF-encoded protein are identified by eliminating those
masses that are also observed when the library is similarly
screened with a streptavidin column. The molecular mass of
each ligand for the ORF-encoded protein identifies one
combination of peripheral moieties plus scaffold. The
individual compound(s) that result from the identified
combination of peripheral moieties plus scaffold are
synthesized and tested for their ability to inhibit the
replication or transmission of the virus in a mammalian
cell bioassay or animal model.

The observation of a virus-specific inhibitory
activity implicates the ORF-encoded protein as a critical
component of the viral disease process and confirms that
the ORF-encoded protein is specifically amenable to small
molecule anti-viral drug discovery. Observation of a
direct correlation between the relative binding affinities
of the ORF-encoded protein-specific ligands and the
relative inhibitory concentrations of the ORF-encoded
protein-specific ligands further strengthens the
identification of the ORF-encoded protein as a target for
small molecule anti-viral drug discovery.
Example 7 Development of small molecules that can be
applied to the affinity purification of a
target protein

A mass-coded combinatorial library can be used to
identify ligands that have a high affinity for a target
protein. One such target protein is human erythropoietin
(EPO), which is expressed and purified industrially for use
as a therapeutic drug. Ligands that exhibit a high
affinity for EPO can be immobilized on a solid support to
generate an EPO-specific affinity matrix.

The identification of ligands to EPO and the
construction of an EPO-specific affinity matrix involve
the following steps: the mass-coded library is solubilized
in an appropriate binding buffer and incubated with the EPO
protein. Compounds that bind to the EPO protein are
separated from the unbound compounds by rapid size
exclusion chromatography. These compounds are loaded with
the EPO protein onto a reversed-phase column that is placed
downstream of the size exclusion column. The compounds are
dissociated from the EPO protein and are eluted from the
reversed-phase column and analyzed by mass spectrometry.
The molecular mass of each EPO protein-specific ligand
identifies one combination of peripheral moieties plus
scaffold. The individual ligand(s) that result from the
identified combination of peripheral moieties plus scaffold
are synthesized and tested for their in vitro ability to
bind to the EPO protein. Compounds exhibiting the highest
affinity for the EPO protein are compared to identify
similarities between the compounds. If it is observed that
one reactive site on the scaffold is relatively unimportant
for high affinity binding, the peripheral moiety at this
site is subsequently replaced with a covalent tether that
joins the EPO-specific ligand to a water-insoluble matrix,
thereby generating an EPO-specific affinity matrix.

Alternatively, the covalent tether is used to join the
EPO-specific ligand to another molecule, such as biotin,
which possesses a high affinity for a commercially
available affinity matrix (streptavidin-derivatized
agarose). The biotin-streptavidin interaction is used as a
strong, non-covalent immobilization technique.

Example 8 Development of small molecules that can be
applied to the visualization of a target
protein

A mass-coded combinatorial library can be used to
identify ligands that have a high affinity for a target
protein. One such target protein is the human protein
telomerase, the expression of which is linked to cancer
progression and aging. Ligands that exhibit a high
affinity for telomerase can be functionalized with a
radioactive or non-radioactive tag to thereby generate a
telomerase-specific affinity probe for visualization of the
enzyme in vitro or in vivo. The identification of ligands
to telomerase and the construction of a telomerase-specific
affinity probe would involve the following steps: a
mass-coded library is solubilized in an appropriate binding
buffer and incubated with telomerase protein alone.
Compounds that bind to the telomerase protein are separated
from the unbound compounds by rapid size exclusion
chromatography. The binding compounds are loaded with the
telomerase protein onto a reversed-phase column that is
placed downstream of the size exclusion column. The
compounds are dissociated from the telomerase protein and
are eluted from the reversed-phase column and analyzed by
mass spectrometry. The mass of each telomerase
protein-specific ligand identifies one combination of
peripheral moieties plus scaffold. The individual ligand(s)
that result from the identified combination of peripheral
moieties plus scaffold are synthesized and tested for their
in vitro ability to bind to the telomerase protein.
Compounds exhibiting the highest affinity for the
telomerase protein are compared to identify similarities
between the compounds. Ideally, it is observed that one
reactive site on the scaffold is relatively unimportant for
high affinity binding. The peripheral moiety at this site
is subsequently replaced with a covalent tether that joins
the telomerase-specific ligand to a radioactive moiety or a
non-radioactive moiety such as a fluorophore, thereby
generating a telomerase-specific affinity probe.
Example 9 Identification of a small molecule inhibitor
of bovine trypsin by affinity selection with
a mass-coded combinatorial library mixture



A mass-coded library mixture was synthesized by the
reaction of the scaffold precursor

[Chemical structure omitted; only two chlorine labels (Cl, Cl)
survive in the extracted text.]

with a set of ten peripheral moiety precursors selected as
described in Example 1. This set of peripheral moiety
precursors is shown below. The peripheral moiety
precursors were selected to yield a mass-coded
combinatorial library mixture of compounds of the formula
X(Y)4, where X is this scaffold and each Y is,
independently, a peripheral moiety derived from one of the
peripheral moiety precursors. This library includes 715
distinct combinations of 4 peripheral moieties and 715
distinct masses.
A homogeneous solution containing 2 mM mass-coded
library mixture and 50 µM bovine trypsin in binding buffer
(50 mM Tris, pH 7.75; 40 mM CaCl2; 10% DMSO total v/v) was
incubated for 30 minutes at room temperature and then
incubated on ice for 5 minutes. 20 µL of this mixture was
injected onto a 4.6x200 mm size-exclusion HPLC column and
eluted with binding buffer at 1.5 ml/min. The protein
eluted slightly before the unbound library components, and
this protein peak was collected. Formic acid and
acetonitrile were added to final concentrations of 10% each
to dissociate any ligands from the protein, and the
resultant mixture was analyzed by LC-MS. Mass
spectrometric analysis yielded one mass which matched with
one combination of 4 peripheral moieties plus scaffold, and
synthesis of a single isomer having this combination
confirmed the identity of the trypsin ligand, which is
shown below.
This molecule was a potent inhibitor of trypsin when
assayed by a standard in vitro trypsin activity assay.

EQUIVALENTS 

While this invention has been particularly shown and
described with references to preferred embodiments thereof,
it will be understood by those skilled in the art that
various changes in form and details may be made therein
without departing from the spirit and scope of the
invention as defined by the appended claims. Those skilled
in the art will recognize, or be able to ascertain using no
more than routine experimentation, many equivalents to the
specific embodiments of the invention described
specifically herein. Such equivalents are intended to be
encompassed in the scope of the claims.
CLAIMS

We claim: 

1. A method for producing a mass-coded set of compounds
of the general formula X(Y)n, wherein X is a scaffold,
n is from 2 to about 6, and each Y is, independently,
a peripheral moiety, comprising the steps of:

(a) selecting a peripheral moiety precursor subset
from a peripheral moiety precursor set, said
subset comprising a sufficient number of
peripheral moiety precursors that there exist at
least about 250 distinct combinations of n
peripheral moieties derived from said subset,
wherein at least about 90% of said combinations
of n peripheral moieties derived from said subset
have molecular mass sums which are distinct from
the molecular mass sums of all other combinations
of n peripheral moieties derived from said
subset; and

(b) contacting said peripheral moiety precursor
subset with a scaffold precursor, said scaffold
precursor having n reactive groups, wherein each
reactive group is capable of reacting with at
least one peripheral moiety precursor to form a
covalent bond, under conditions sufficient for
the reaction of each reactive group with a
peripheral moiety precursor,

thereby producing a mass-coded set of compounds of the
general formula X(Y)n.






2. The method of Claim 1 wherein the scaffold precursor
comprises one or more saturated, partially unsaturated
or aromatic cyclic groups.

3. The method of Claim 2 wherein at least one cyclic
group is substituted by one or more reactive groups.

4. The method of Claim 3 wherein the reactive groups are 
attached to the cyclic group directly or via an 
intervening C^-alkylene group. 

5. The method of Claim 4 wherein each reactive group is
independently selected from the group consisting of:
reactive carbonyl groups, reactive sulfonyl groups,
reactive phosphonyl groups, terminal epoxide groups and
the isocyanate group.

6. The method of Claim 5 wherein the reactive group is
selected from the group consisting of: carbonyl
chloride, carbonyl pentafluorophenyl ester and
sulfonyl chloride.

7. The method of Claim 5 wherein at least one peripheral
moiety precursor comprises a primary amino group, a
secondary amino group or a hydroxyl group.

8. The method of Claim 4 wherein each reactive group is 
independently selected from the group consisting of: 
primary amino, secondary amino and hydroxyl. 
9. The method of Claim 8 wherein at least one peripheral
moiety precursor comprises a reactive carbonyl group,
reactive sulfonyl group, reactive phosphonyl group,
terminal epoxide group or an isocyanate group.

10. The method of Claim 9 wherein at least one peripheral
moiety precursor comprises a carbonyl chloride, a
carbonyl pentafluorophenyl ester or a sulfonyl
chloride group.

11. A method as claimed in Claim 1 wherein the step of
selecting includes the steps of:

(a) choosing every set of two different peripheral
moiety precursors from the peripheral moiety
precursor set, said choosing performed in a
manner such that for each set of two, if the two
peripheral moiety precursors have equal molecular
masses then one of the two is removed forming a
remaining set;

(b) from the remaining set, choosing every set of
four peripheral moiety precursors, including for
a given set of four, removing one of the four
peripheral moiety precursors if a sum of the
molecular masses of a first two precursors in the
given set of four equals a sum of the molecular
masses of a second two precursors in the given
set of four peripheral moiety precursors, said
choosing forming a remainder set;

(c) from the remainder set, choosing every set of six
different peripheral moiety precursors, including
for a given set of six, removing one of the six
peripheral moiety precursors if a sum of the
molecular masses of a first three precursors in
the given set of six equals a sum of the
molecular masses of a second three precursors in
the given set of six, said choosing forming a
working selection set of peripheral moiety
precursors from which to select a desired subset;
and

(d) from the working selection set, choosing a
desired subset so as to provide the selected
subset by

(i) choosing a possible selected subset from the
working selection set,

(ii) from the chosen possible subset, generating
all possible combinations of n peripheral
moiety precursors, and

(iii) determining whether the generated
combinations have an acceptable percent mass
redundancy, and if so, selecting the chosen
possible subset as the selected subset.
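Steps (a) through (c) of this claim describe a three-stage pre-filter, which can be sketched in Python as shown below. This is one possible reading, offered as an illustration rather than as the claimed implementation. It assumes masses are supplied as integers (for example in units of 0.01 Da) so that equality is exact, and it assumes that the last member of an offending group is the one removed, a choice the claim leaves open. The function names are hypothetical, and step (d) is not covered.

```python
from itertools import combinations


def _has_balanced_split(group, k):
    """True if k members of group (size 2k) sum to the same as the rest."""
    total = sum(group)
    first = group[0]  # fix one member in a half to skip mirror splits
    for rest in combinations(group[1:], k - 1):
        if 2 * (first + sum(rest)) == total:
            return True
    return False


def prune(masses, k):
    """Drop precursors until no group of 2k splits into equal halves."""
    kept = list(masses)
    changed = True
    while changed:
        changed = False
        for group in combinations(kept, 2 * k):
            if _has_balanced_split(group, k):
                kept.remove(group[-1])  # assumed removal rule
                changed = True
                break  # restart the scan on the reduced set
    return kept


def working_selection_set(masses):
    """Apply steps (a), (b), (c): groups of two, four, then six."""
    for k in (1, 2, 3):
        masses = prune(masses, k)
    return masses


print(working_selection_set([100, 100, 150, 200, 250, 300]))
```

In this toy input the duplicate 100 is removed at step (a), and 250 is removed at step (b) because 100 + 250 equals 150 + 200. The scan is exhaustive, so it is only practical for small sets as written.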



12. A method as claimed in Claim 11 wherein the step of
selecting is performed by a digital processor
assembly.
13. A method as claimed in Claim 1 wherein the step of
selecting includes selecting a peripheral moiety
precursor subset which includes peripheral moiety
precursors that simultaneously produce mass-coded
compounds when contacted with a scaffold precursor.

14. A method as claimed in Claim 13 wherein the step of
selecting comprises the steps of:

(a) choosing every set of two different peripheral
moiety precursors from the peripheral moiety
precursor set, said choosing performed in a
manner such that for each set of two, if the two
peripheral moiety precursors have equal molecular
masses then one of the two is removed forming a
remaining set;

(b) from the remaining set, choosing every set of
four different peripheral moiety precursors,
including for a given set of four, removing one
of the four peripheral moiety precursors if a sum
of the molecular masses of a first two precursors
in the given set of four equals a sum of the
molecular masses of a second two precursors in
the given set of four peripheral moiety
precursors, said choosing forming a remainder
set;

(c) from the remainder set, choosing every set of six
different peripheral moiety precursors, including
for a given set of six, removing one of the six
peripheral moiety precursors if a sum of the
molecular masses of a first three precursors in
the given set of six equals a sum of the
molecular masses of a second three precursors in
the given set of six, said choosing forming a
working selection set of peripheral moiety
precursors from which to select a desired subset;
and

(d) from the working selection set, choosing a
desired subset so as to provide the selected
subset by

(i) choosing a possible selected subset from the
working selection set,

(ii) from the chosen possible subset, generating
all possible combinations of n peripheral
moiety precursors, and

(iii) determining whether the generated
combinations have an acceptable percent mass
redundancy, and if so, selecting the chosen
possible subset as the selected subset.

15. A method as claimed in Claim 14 wherein the step of
selecting is performed by a digital processor
assembly.

16. A method for identifying a member of a mass-coded
combinatorial library which is a ligand for a
biomolecule, said mass-coded molecular library
comprising compounds of the general formula XYn,
wherein n is an integer from 2 to about 6, X is a
scaffold and each Y is, independently, a peripheral
moiety, wherein said mass-coded combinatorial library
is produced by reacting a scaffold precursor with a
sufficient number of distinct peripheral moiety
precursors such that there exist at least about 250
distinct combinations of n peripheral moieties derived
from said peripheral moiety precursors, said method
comprising the steps of:

(a) contacting the biomolecule with the mass-coded
molecular library, whereby members of the
mass-coded molecular library which are ligands for the
biomolecule bind to the biomolecule to form
biomolecule-ligand complexes and members of the
mass-coded library which are not ligands for the
biomolecule remain unbound;

(b) separating the biomolecule-ligand complexes from
the unbound members of the mass-coded molecular
library;

(c) dissociating the biomolecule-ligand complexes;
and

(d) determining the molecular mass of each ligand to
identify the set of n peripheral moieties present
in each ligand,

wherein the molecular mass of each ligand corresponds
to a set of n peripheral moieties present in that
ligand, thereby identifying a member of the mass-coded
combinatorial library which is a ligand for the
biomolecule.
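Step (d) amounts to a lookup from an observed mass back to a set of n peripheral moieties, which can be sketched as follows. This is a hedged illustration only. The moiety names and masses are invented placeholders, the scaffold and moiety values are assumed to be the masses as incorporated in the final compound (not the precursor masses), and matching to two decimal places is an assumed resolution.

```python
from itertools import combinations_with_replacement


def build_mass_index(scaffold_mass, moiety_masses, n_sites, decimals=2):
    """Map each rounded compound mass to the moiety sets that give it."""
    scale = 10 ** decimals
    index = {}
    names = sorted(moiety_masses)
    for combo in combinations_with_replacement(names, n_sites):
        total = scaffold_mass + sum(moiety_masses[name] for name in combo)
        index.setdefault(round(total * scale), []).append(combo)
    return index


def decode(index, observed_mass, decimals=2):
    """Return the moiety sets whose mass matches the observed mass."""
    return index.get(round(observed_mass * 10 ** decimals), [])


# Hypothetical scaffold and moieties; all values are placeholders.
index = build_mass_index(100.0, {"A": 10.0, "B": 11.5, "C": 14.25}, n_sites=2)
print(decode(index, 121.50))  # [('A', 'B')]
```

In a well mass-coded library most keys map to a single combination. A key that maps to several combinations is a mass redundancy, and each listed combination would then need to be synthesized and tested.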
17. The method of Claim 16 wherein the biomolecule is
immobilized on a solid support.

18. The method of Claim 17 wherein the solid support is a
water-insoluble matrix contained within a
chromatographic column.

19. The method of Claim 16 wherein a solution comprising
the biomolecule is contacted with the mass-coded
molecular library to form, if one or more members of the
mass-coded molecular library are ligands for the
biomolecule, a solution comprising biomolecule-ligand
complexes and unbound members of the mass-coded
molecular library.

20. The method of Claim 19 wherein the unbound members of
the mass-coded molecular library are separated from
the biomolecule-ligand complexes by directing the
solution comprising biomolecule-ligand complexes and
the unbound members of the mass-coded molecular
library through a size exclusion chromatography
column, whereby the unbound members of the mass-coded
molecular library elute from said column after the
biomolecule-ligand complexes.

21. The method of Claim 19 wherein the unbound members of
the mass-coded molecular library are separated from
the biomolecule-ligand complexes by contacting the
solution comprising biomolecule-ligand complexes and
the unbound members of the mass-coded molecular
library with a size-exclusion membrane, whereby the
unbound compounds pass through said membrane and the
biomolecule-ligand complexes do not pass through said
membrane.

22. The method of Claim 16 wherein the biomolecule is a
protein or a nucleic acid molecule.

23. A method for identifying a member of a mass-coded
combinatorial library which is a ligand for a
biomolecule and assessing the effect of the binding of
the ligand to the biomolecule, said mass-coded
molecular library comprising compounds of the general
formula XY n, wherein n is an integer from 2 to
about 6, X is a scaffold and each Y is, independently,
a peripheral moiety, wherein said mass-coded
combinatorial library is produced by reacting a
scaffold precursor with a sufficient number of
distinct peripheral moiety precursors such that there
exist at least about 250 distinct combinations of n
peripheral moieties derived from said peripheral
moiety precursors, said method comprising the steps
of:

(a) contacting the biomolecule with the mass-coded
molecular library, whereby members of the mass-coded
molecular library which are ligands for the
biomolecule bind to the biomolecule to form
biomolecule-ligand complexes and members of the
mass-coded library which are not ligands for the
biomolecule remain unbound;

(b) separating the biomolecule-ligand complexes from
the unbound members of the mass-coded molecular
library;

(c) dissociating the biomolecule-ligand complexes; 

(d) determining the molecular mass of each ligand to 
identify the set of n peripheral moieties present 
in each ligand, 

wherein the molecular mass of each ligand corresponds
to a set of n peripheral moieties present in that
ligand, thereby identifying a member of the mass-coded
combinatorial library which is a ligand for the
biomolecule; and

(e) assessing in an in vitro assay the effect of the
binding of the ligand to the biomolecule on the
function of the biomolecule.
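
The decoding in step (d) can be illustrated with a short Python
sketch. It is an illustration under stated assumptions only, not
part of the claimed method: the function names, the scaffold mass,
the peripheral moiety masses and the mass tolerance are all
hypothetical, and the ligand mass is assumed to equal the scaffold
mass plus the sum of its n peripheral moiety masses.

```python
from itertools import combinations_with_replacement


def build_mass_table(scaffold_mass, moiety_masses, n):
    """Map each library mass to the moiety combinations that produce it.

    moiety_masses is a dict of hypothetical moiety name -> mass.
    Repeats are allowed because each Y is chosen independently.
    """
    table = {}
    for combo in combinations_with_replacement(sorted(moiety_masses), n):
        mass = round(scaffold_mass + sum(moiety_masses[name] for name in combo), 4)
        table.setdefault(mass, []).append(combo)
    return table


def decode_ligand(observed_mass, table, tolerance=0.01):
    """Return every moiety combination whose mass lies within the tolerance."""
    return [
        combo
        for mass, combos in table.items()
        if abs(mass - observed_mass) <= tolerance
        for combo in combos
    ]


# Hypothetical example values, chosen only to make the sketch runnable.
example_table = build_mass_table(100.0, {"A": 57.02, "B": 71.04, "C": 99.07}, 2)
example_hits = decode_ligand(228.06, example_table)
```

An observed mass that decodes to exactly one combination identifies
the set of peripheral moieties directly; a mass that decodes to
several combinations marks a redundant mass in the library.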



24. The method of Claim 23 wherein the in vitro assay is a
cell proliferation assay, a cell death assay or a
viral replication assay.

25. The method of Claim 23 wherein the biomolecule is a 
protein or a nucleic acid molecule.

26. A method for identifying a member of a mass-coded
combinatorial library which is a ligand for a
biomolecule and assessing the effect of the binding of
the ligand to the biomolecule, said mass-coded
molecular library comprising compounds of the general
formula XY n, wherein n is an integer from 2 to
about 6, X is a scaffold and each Y is, independently,
a peripheral moiety, wherein said mass-coded
combinatorial library is produced by reacting a
scaffold precursor with a sufficient number of
distinct peripheral moiety precursors such that there
exist at least about 250 distinct combinations of n
peripheral moieties derived from said peripheral
moiety precursors, said method comprising the steps
of:

(a) contacting the biomolecule with the mass-coded
molecular library, whereby members of the mass-coded
molecular library which are ligands for the
biomolecule bind to the biomolecule to form
biomolecule-ligand complexes and members of the
mass-coded library which are not ligands for the
biomolecule remain unbound;

(b) separating the biomolecule-ligand complexes from
the unbound members of the mass-coded molecular
library;

(c) dissociating the biomolecule-ligand complexes;

(d) determining the molecular mass of each ligand to
identify the set of n peripheral moieties present
in each ligand,

wherein the molecular mass of each ligand corresponds
to a set of n peripheral moieties present in that
ligand, thereby identifying a member of the mass-coded
combinatorial library which is a ligand for the
biomolecule; and

(e) assessing in an in vivo assay the effect of the 
binding of the ligand to the biomolecule on the 
function of the biomolecule. 

27. The method of Claim 26 wherein the effect of the
binding of the ligand to the biomolecule on the 
function of the biomolecule is assessed in an animal 
model, in an organism or in a human. 

28. The method of Claim 26 wherein the biomolecule is a 
protein or a nucleic acid molecule.

29. A method for identifying a member of a mass-coded
molecular library which is a ligand for a biomolecule
and binds to the biomolecule at the binding site of a
known second ligand for the biomolecule, said
mass-coded molecular library comprising compounds of the
general formula XY n, wherein n is an integer from 2 to
about 6, X is a scaffold and each Y is, independently,
a peripheral moiety, wherein said mass-coded molecular
library is produced by reacting a scaffold precursor
with a sufficient number of distinct peripheral moiety
precursors such that there exist at least about 250
distinct combinations of n peripheral moieties derived
from said peripheral moiety precursors, said method
comprising the steps of:

(a) contacting the biomolecule with the mass-coded
molecular library, whereby members of the mass-coded
molecular library which are ligands for the
biomolecule bind to the biomolecule to form
biomolecule-ligand complexes and members of the
mass-coded library which are not ligands for the
biomolecule remain unbound;

(b) separating the biomolecule-ligand complexes from 
the unbound members of the mass-coded molecular 
library; 

(c) contacting the biomolecule-ligand complexes with
the second ligand to dissociate biomolecule-ligand
complexes in which the ligand binds to the
biomolecule at the binding site of the second
ligand, thereby forming biomolecule-second ligand
complexes and dissociated ligands;

(d) separating the dissociated ligands and
biomolecule-ligand complexes; and

(e) determining the molecular mass of each
dissociated ligand,

wherein the molecular mass of each dissociated ligand
corresponds to a set of peripheral moieties present in
that ligand, thereby identifying a member of the
mass-coded molecular library which is a ligand for the
biomolecule and binds to the biomolecule at the
binding site of the known second ligand for the
biomolecule.



30. The method of Claim 29 wherein the second ligand is a
polypeptide, a nucleic acid molecule or a cofactor.

31. The method of Claim 29 wherein the biomolecule
is immobilized on a solid support.

32. The method of Claim 31 wherein the solid support is a
water-insoluble matrix contained within a
chromatographic column.

33. The method of Claim 29 wherein the biomolecule is a 
protein or a nucleic acid molecule. 

34. A method for identifying a member of a mass-coded
combinatorial library which is a ligand for a first
biomolecule but is not a ligand for a second
biomolecule, said mass-coded molecular library
comprising compounds of the general formula XY n,
wherein n is an integer from 2 to about 6, X is a
scaffold and each Y is, independently, a peripheral
moiety, wherein said mass-coded combinatorial library
is produced by reacting a scaffold precursor with a
sufficient number of distinct peripheral moiety
precursors such that there exist at least about 250
distinct combinations of n peripheral moieties derived
from said peripheral moiety precursors, said method
comprising the steps of:

(a) contacting the first biomolecule with the
mass-coded molecular library, whereby members of the
mass-coded molecular library which are ligands
for the first biomolecule bind to the first
biomolecule to form first biomolecule-ligand
complexes and members of the mass-coded library
which are not ligands for the first biomolecule
remain unbound;

(b) separating the first biomolecule-ligand complexes
from the unbound members of the mass-coded
molecular library;

(c) dissociating the first biomolecule-ligand 
complexes; 

(d) determining the molecular mass of each ligand for
the first biomolecule;

(e) contacting the second biomolecule with the
mass-coded molecular library, whereby members of the
mass-coded molecular library which are ligands
for the second biomolecule bind to the second
biomolecule to form second biomolecule-ligand
complexes and members of the mass-coded library
which are not ligands for the second biomolecule
remain unbound;

(f) separating the second biomolecule-ligand
complexes from the unbound members of the
mass-coded molecular library;

(g) dissociating the second biomolecule-ligand 
complexes; 

(h) determining the molecular mass of each ligand for
the second biomolecule; and

(i) determining which molecular mass or masses
determined in step (d) are not determined in step
(h), thereby providing the molecular masses of
members of the mass-coded combinatorial library
which are ligands for the first biomolecule but
are not ligands for the second biomolecule,

wherein each molecular mass determined in step (i)
corresponds to a set of n peripheral moieties present
in a ligand for the first biomolecule which is not a
ligand for the second biomolecule, thereby identifying
a member of the mass-coded combinatorial library which
is a ligand for the first biomolecule but is not a
ligand for the second biomolecule.
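
The comparison in step (i) amounts to a set difference over measured
masses. The Python sketch below shows one way to express it. It
assumes, for illustration only, that two measurements denote the
same compound when they agree within a fixed tolerance; the function
name, the tolerance and the example masses are hypothetical.

```python
def masses_unique_to_first(first_masses, second_masses, tolerance=0.01):
    """Return masses found for the first biomolecule but not the second.

    A mass counts as "also determined" for the second biomolecule when
    any second-biomolecule mass lies within the tolerance of it.
    """
    return [
        mass
        for mass in first_masses
        if all(abs(mass - other) > tolerance for other in second_masses)
    ]


# Hypothetical measured masses from steps (d) and (h).
step_d_masses = [228.06, 256.09, 298.14]
step_h_masses = [256.091, 270.11]
selective_masses = masses_unique_to_first(step_d_masses, step_h_masses)
```

Each mass that survives the comparison can then be decoded to its
set of n peripheral moieties as recited in the wherein clause.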

35. The method of Claim 34 wherein the first and second
biomolecules are each, independently, a protein or a 
nucleic acid molecule. 

36. The method of Claim 35 wherein the first and second
biomolecules are each a protein and the amino acid
sequence of the second biomolecule is derived from the
amino acid sequence of the first biomolecule by
insertion, deletion or substitution of one or more
amino acid residues.



37. The method of Claim 35 wherein the first biomolecule
is a first protein and the second biomolecule is a
second protein, said first and second proteins having
the same amino acid sequence, wherein said first and
second proteins have different posttranslational
modifications.

38. The method of Claim 37 wherein the first protein
differs from the second protein in extent of
phosphorylation, glycosylation or ubiquitination.



39. The method of Claim 35 wherein the second biomolecule
is a complex of the first biomolecule with a ligand.

40. The method of Claim 35 wherein the first and second 
biomolecules are each immobilized on a solid support. 

41. The method of Claim 40 wherein the solid support is a
water-insoluble matrix contained within a
chromatographic column.

42. The method of Claim 35 wherein a solution comprising
the first biomolecule is contacted with the mass-coded
molecular library to form a solution comprising first
biomolecule-ligand complexes and unbound members of
the mass-coded molecular library and a solution
comprising the second biomolecule is contacted with
the mass-coded molecular library to form a solution
comprising second biomolecule-ligand complexes and
unbound members of the mass-coded molecular library.



43. The method of Claim 42 wherein the unbound members of
the mass-coded molecular library are separated from
the second biomolecule-ligand complexes by directing
the solution comprising second biomolecule-ligand
complexes and the unbound members of the mass-coded
molecular library through a size exclusion
chromatography column, whereby the unbound members of
the mass-coded molecular library elute from said
column after the second biomolecule-ligand complexes.

44. The method of Claim 42 wherein the unbound members of
the mass-coded molecular library are separated from
the second biomolecule-ligand complexes by contacting
the solution comprising second biomolecule-ligand
complexes and the unbound members of the mass-coded
molecular library with a size-exclusion membrane,
whereby the unbound compounds pass through said
membrane and the second biomolecule-ligand complexes
do not pass through said membrane.

45. A method for identifying a member of a mass-coded
combinatorial library which is a ligand for a first
biomolecule but is not a ligand for a second
biomolecule, said mass-coded molecular library
comprising compounds of the general formula XY n,
wherein n is an integer from 2 to about 6, X is a
scaffold and each Y is, independently, a peripheral
moiety, wherein said mass-coded combinatorial library
is produced by reacting a scaffold precursor with a
sufficient number of distinct peripheral moiety
precursors such that there exist at least about 250
distinct combinations of n peripheral moieties derived
from said peripheral moiety precursors, said method
comprising the steps of:

(a) contacting the second biomolecule with the
mass-coded molecular library, whereby members of the
mass-coded molecular library which are ligands
for the second biomolecule bind to the second
biomolecule to form second biomolecule-ligand
complexes and members of the mass-coded library
which are not ligands for the second biomolecule
remain unbound;

(b) separating the second biomolecule-ligand
complexes from the unbound members of the
mass-coded molecular library;

(c) contacting the first biomolecule with the unbound
members of the mass-coded molecular library of
step (b), whereby members of the mass-coded
molecular library which are ligands for the first
biomolecule bind to the first biomolecule to form
first biomolecule-ligand complexes and members of
the mass-coded library which are not ligands for
the first biomolecule remain unbound;

(d) dissociating the first biomolecule-ligand
complexes;

(e) determining the molecular mass of each ligand for
the first biomolecule;

wherein each molecular mass determined in step (e)
corresponds to a set of n peripheral moieties present
in a ligand for the first biomolecule which is not a
ligand for the second biomolecule, thereby identifying
a member of the mass-coded combinatorial library which
is a ligand for the first biomolecule but is not a
ligand for the second biomolecule.

46. The method of Claim 45 wherein the first and second
biomolecules are each, independently, a protein or a
nucleic acid molecule. 

47. The method of Claim 45 wherein the second biomolecule
is immobilized on a solid support. 

48. The method of Claim 47 wherein the solid support is a 
water-insoluble matrix contained within a 
chromatographic column.

49. Apparatus for producing a mass-coded set of compounds
of the general formula X(Y) n, wherein X is a scaffold,
n is from 2 to about 6, and each Y is, independently,
a peripheral moiety, comprising
a digital processor assembly for selecting a
peripheral moiety precursor subset from a peripheral
moiety precursor set, said subset comprising a
sufficient number of peripheral moiety precursors that
there exist at least about 250 distinct combinations
of n peripheral moieties derived from said subset,
wherein at least about 90% of said combinations of n
peripheral moieties derived from said subset have
molecular mass sums which are distinct from the
molecular mass sums of all other combinations of n
peripheral moieties derived from said subset.
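
The two numerical criteria recited above (at least about 250
combinations, at least about 90% of them with a distinct mass sum)
can be checked for a candidate subset with a short Python sketch.
This is an illustration under stated assumptions, not the claimed
apparatus: the function names are hypothetical, combinations are
counted with repetition because each Y is chosen independently, and
two mass sums are treated as equal when they agree after rounding to
a chosen number of decimals.

```python
from collections import Counter
from itertools import combinations_with_replacement


def fraction_distinct_mass_sums(masses, n, decimals=2):
    """Return (number of combinations, fraction with a unique mass sum)."""
    sums = [
        round(sum(combo), decimals)
        for combo in combinations_with_replacement(masses, n)
    ]
    counts = Counter(sums)
    distinct = sum(1 for s in sums if counts[s] == 1)
    return len(sums), distinct / len(sums)


def meets_criteria(masses, n, min_combinations=250, min_fraction=0.90):
    """Check a candidate subset against both thresholds."""
    total, fraction = fraction_distinct_mass_sums(masses, n)
    return total >= min_combinations and fraction >= min_fraction
```

For example, the hypothetical masses 1.0, 2.0 and 3.0 with n = 2 give
six combinations, of which (1.0, 3.0) and (2.0, 2.0) share a mass
sum, so only four of the six are distinct.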
50. Apparatus as claimed in Claim 49 wherein the digital
processor assembly employs a routine executed by a
digital processor to:

(a) choose every set of two different peripheral
moiety precursors from the peripheral moiety
precursor set, said choosing performed in a
manner such that for each set of two, if the two
peripheral moiety precursors have equal molecular
masses then one of the two is removed forming a
remaining set;

(b) from the remaining set, choose every set of four
peripheral moiety precursors, including for a
given set of four, removing one of the four
peripheral moiety precursors if a sum of the
molecular masses of a first two precursors in the
given set of four equals a sum of the molecular
masses of a second two precursors in the given
set of four peripheral moiety precursors, said
choosing forming a remainder set;

(c) from the remainder set, choose every set of six
different peripheral moiety precursors, including
for a given set of six, removing one of the six
peripheral moiety precursors if a sum of the
molecular masses of a first three precursors in
the given set of six equals a sum of the
molecular masses of a second three precursors in
the given set of six, said choosing forming a
working selection set of peripheral moiety
precursors from which to select a desired subset;

(d) from the working selection set, choose a desired
subset so as to provide the selected subset by

(i) choosing a possible selected subset from the
working selection set,

(ii) from the chosen possible subset, generating
all possible combinations of n peripheral
moiety precursors, and

(iii) determining whether the generated
combinations have an acceptable percent mass
redundancy, and if so, selecting the chosen
possible subset as the selected subset.
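
Steps (a) through (d) can be sketched in Python as follows. This is
a minimal sketch under stated assumptions rather than the routine
actually executed by the apparatus: all function names are
hypothetical, masses are compared after rounding, the precursor
removed from a conflicting group is simply the last one in it (the
claim does not say which to remove), candidate subsets are tried in
plain enumeration order, and "acceptable percent mass redundancy" is
taken to mean at most a caller-supplied percentage.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement


def drop_equal_sum_groups(masses, k, decimals=2):
    """Remove one precursor whenever two disjoint groups of k have equal mass sums.

    k=1 covers step (a), k=2 covers step (b), k=3 covers step (c).
    """
    kept = list(masses)
    changed = True
    while changed:
        changed = False
        seen = {}  # rounded mass sum -> index groups already producing it
        for group in combinations(range(len(kept)), k):
            total = round(sum(kept[i] for i in group), decimals)
            earlier = seen.setdefault(total, [])
            if any(set(g).isdisjoint(group) for g in earlier):
                del kept[group[-1]]  # assumption: drop the last member
                changed = True
                break  # indices shifted, so rescan the reduced set
            earlier.append(group)
    return kept


def percent_mass_redundancy(masses, n, decimals=2):
    """Percentage of n-combinations whose mass sum is shared with another."""
    sums = [
        round(sum(combo), decimals)
        for combo in combinations_with_replacement(masses, n)
    ]
    counts = Counter(sums)
    redundant = sum(1 for s in sums if counts[s] > 1)
    return 100.0 * redundant / len(sums)


def select_subset(precursor_masses, n, subset_size, max_redundancy=10.0):
    """Steps (a)-(c) build the working selection set; step (d) picks a subset."""
    working = sorted(precursor_masses)
    for k in (1, 2, 3):
        working = drop_equal_sum_groups(working, k)
    for candidate in combinations(working, subset_size):
        if percent_mass_redundancy(candidate, n) <= max_redundancy:
            return list(candidate)
    return None  # no candidate of this size is acceptable
```

With the hypothetical input masses 1.0, 1.0, 2.0, 3.0, 4.0, 8.0 and
16.0, step (a) removes the duplicate 1.0 and step (b) removes 3.0
(because 1.0 + 4.0 equals 2.0 + 3.0), leaving a working selection
set in which every pair and triple has a distinct mass sum.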